Message boards : Sixtrack Application : The Sixtrack Application

maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5595 - Posted: 5 Nov 2018, 16:58:58 UTC
Last modified: 5 Nov 2018, 17:58:48 UTC

Sorry, but Crystal, is it possible to move messages No. 5588, 5590 and 5591 to the LHCb thread?
This is a SixTrack thread.
Thank you.

BTW, with one CPU and one task, SixTrack runs without an abort from the server!
Edit:
Some tasks are aborted after all; the message in BOINC is:
[error] garbage_collect() still have active task for acked result Sixtrack....
ID: 5595 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5596 - Posted: 5 Nov 2018, 20:05:29 UTC - in response to Message 5595.  
Last modified: 5 Nov 2018, 20:19:39 UTC

57 seconds after a task starts, there is a request for a new task.
At that moment, some tasks were interrupted and failed.
Is it possible to eliminate this request?
BOINC wrote:
Requesting new tasks for CPU
Scheduler request completed: got 0 new tasks
No tasks sent
No tasks are available for Sixtrack Simulation
This computer has reached a limit on tasks in progress.

I am using one task and one CPU in the preferences.
ID: 5596 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5597 - Posted: 5 Nov 2018, 20:27:48 UTC - in response to Message 5595.  

> Sorry, but Crystal, is it possible to move messages No. 5588, 5590 and 5591 to the LHCb thread?
> This is a SixTrack thread.
> Thank you.

Sorry, can't do that.
I'm not a moderator and don't want to be one here.
Laurence or Nils could do that, but because this project is only for development and testing, 'normal' (non-testing) users should spend their CPU power on the LHC production project.
'We' testers should be prepared for several kinds of software failure, for no credit being granted, or even for a crash of the test client machine.
ID: 5597 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5598 - Posted: 5 Nov 2018, 20:30:42 UTC - in response to Message 5595.  

> BTW, with one CPU and one task, SixTrack runs without an abort from the server!

Even if you receive only 1 task, that one is also cancelled by the server when it is not yet in a running state and you request more work.
ID: 5598 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5599 - Posted: 5 Nov 2018, 20:40:41 UTC - in response to Message 5598.  
Last modified: 5 Nov 2018, 20:48:33 UTC

My understanding of the one-task setting is:
BOINC loads a new task only AFTER uploading the finished one.
That is what I see in BOINC at the moment.
Why is there a new scheduler request for a task after ONE minute of work?
Does this come from the server?
If this request happens simultaneously on many computers because they start their tasks at the same time,
then good night ;-)) - a lot of traffic for a poor server.
ID: 5599 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5600 - Posted: 6 Nov 2018, 9:43:46 UTC

When the network adapter is disconnected after the download,
no new work is requested from the server.
SixTrack downloaded 4 tasks (the preferences allow 7 tasks).
This is an infrastructure problem and not a BOINC scheduling error.
I will test it for a few hours. Computer ID 1164.
ID: 5600 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5601 - Posted: 6 Nov 2018, 12:09:55 UTC

Another major misbehavior of this SixTrack version:
After running for some time with the server aborting running tasks, free memory and swap space
decrease rapidly until only 1 or 2 tasks are really running on a 14-core system.
The aborted tasks of course disappear from BOINC, but their memory is not freed.
Here are the boinc processes and their allocated memory while only 1 task is busy and the other 13 cores are idle:
top - 13:01:42 up 13:44,  1 user,  load average: 1.48, 5.39, 8.90
Tasks: 168 total,   3 running, 165 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  7.2%ni, 92.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   9725296k total,  5097936k used,  4627360k free,     7332k buffers
Swap:  2097148k total,  1907536k used,   189612k free,    24880k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6427 boinc     39  19  446m 330m 1996 R  100  3.5   6:33.56 sixtrack_lin64_
 1224 boinc     30  10  118m  11m 2584 S    0  0.1   2:21.07 boinc
 5462 boinc     39  19  510m 2152 2140 S    0  0.0   0:14.68 sixtrack_lin64_
 5736 boinc     39  19  510m  80m 2328 S    0  0.9   1:00.64 sixtrack_lin64_
 5738 boinc     39  19  510m 330m 2332 S    0  3.5   1:00.62 sixtrack_lin64_
 5742 boinc     39  19  510m 2456 2332 S    0  0.0   1:00.63 sixtrack_lin64_
 5750 boinc     39  19  510m 330m 2332 S    0  3.5   1:00.50 sixtrack_lin64_
 5814 boinc     39  19  510m 2452 2328 S    0  0.0   0:19.39 sixtrack_lin64_
 5818 boinc     39  19  510m 330m 2328 S    0  3.5   0:24.51 sixtrack_lin64_
 5830 boinc     39  19  510m 2344 2328 S    0  0.0   0:22.45 sixtrack_lin64_
 5834 boinc     39  19  510m  91m 2332 S    0  1.0   0:17.18 sixtrack_lin64_
 5842 boinc     39  19  510m 294m 2328 S    0  3.1   0:23.40 sixtrack_lin64_
 5852 boinc     39  19  510m 330m 2332 S    0  3.5   0:16.35 sixtrack_lin64_
 5864 boinc     39  19  510m 330m 2260 S    0  3.5   0:19.43 sixtrack_lin64_
 5870 boinc     39  19  510m 330m 2332 S    0  3.5   1:00.75 sixtrack_lin64_
 5876 boinc     39  19  510m 330m 2328 S    0  3.5   0:23.50 sixtrack_lin64_
 5881 boinc     39  19  510m 330m 2332 S    0  3.5   0:57.05 sixtrack_lin64_
 5889 boinc     39  19  510m 324m 1764 S    0  3.4   0:13.61 sixtrack_lin64_
 5895 boinc     39  19  510m 330m 2332 S    0  3.5   0:50.08 sixtrack_lin64_
 5903 boinc     39  19  510m 325m 2152 S    0  3.4   0:14.32 sixtrack_lin64_
 5909 boinc     39  19  510m 330m 2340 S    0  3.5   0:54.34 sixtrack_lin64_
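
The per-process figures in a listing like this can be totalled with a short script. This is only a minimal sketch: it assumes the column layout shown above (RES in the 6th column, state in the 8th, command in the 12th) and the older procps convention that a bare RES number is in KiB. The function names and the shortened sample are illustrative.

```python
def res_to_mib(field: str) -> float:
    """Convert a top RES field ('330m', '2152', '1.2g') to MiB.

    Assumption: a bare number is KiB, as in the listing above.
    """
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field) / 1024  # bare number: KiB


def sleeping_sixtrack_mib(top_output: str) -> float:
    """Sum RES over sixtrack processes that are in state 'S' (sleeping)."""
    total = 0.0
    for line in top_output.splitlines():
        cols = line.split()
        # Process rows: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(cols) < 12 or not cols[0].isdigit():
            continue  # skip the summary and header lines
        res, state, command = cols[5], cols[7], cols[11]
        if command.startswith("sixtrack") and state == "S":
            total += res_to_mib(res)
    return total


# First rows of the listing above, used as sample input.
sample = """\
 6427 boinc     39  19  446m 330m 1996 R  100  3.5   6:33.56 sixtrack_lin64_
 1224 boinc     30  10  118m  11m 2584 S    0  0.1   2:21.07 boinc
 5462 boinc     39  19  510m 2152 2140 S    0  0.0   0:14.68 sixtrack_lin64_
 5736 boinc     39  19  510m  80m 2328 S    0  0.9   1:00.64 sixtrack_lin64_
 5738 boinc     39  19  510m 330m 2332 S    0  3.5   1:00.62 sixtrack_lin64_
"""
print(f"{sleeping_sixtrack_mib(sample):.1f} MiB held by sleeping sixtrack processes")
```

Feeding it the full output of `top -b -n 1` would give the total for the whole machine; the running task (state R) and the boinc client itself are deliberately left out.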
ID: 5601 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5602 - Posted: 6 Nov 2018, 12:14:51 UTC

With or without the BOINC option
report_results_immediately (0/1):
13 download errors and 35 SixTrack tasks run in three hours.
It seems to be a network traffic problem in the infrastructure.
BOINC checks the network connection more than 10 times every minute.
It is too difficult to find a good answer from the client side.
This test is with openSUSE 13.2 and BOINC 7.2.42, a very stable Linux.
ID: 5602 · Rating: 0
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 5603 - Posted: 6 Nov 2018, 12:25:16 UTC - in response to Message 5601.  

This is more likely a BOINC client issue than a SixTrack issue.
Could you check if the SixTracks disappear if you quit your BOINC client?
ID: 5603 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5604 - Posted: 6 Nov 2018, 12:36:37 UTC - in response to Message 5603.  

> This is more likely a BOINC client issue than a SixTrack issue.
> Could you check if the SixTracks disappear if you quit your BOINC client?

No, I already checked that yesterday.
When the BOINC client is shut down, the sixtrack processes don't disappear, and memory and swap stay allocated.
The fastest and cleanest solution here is: $ reboot
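
Short of a reboot, the leftover processes could in principle be located and killed by hand. The following is only a sketch under assumptions: a Linux `/proc` with per-process `comm` files, and the name prefix `sixtrack` as seen in the top listings. The function names and the `proc_root` parameter (there only so the scan can be tried on a test directory) are hypothetical. Whether a KILL actually frees the memory on the affected hosts has not been tested in this thread.

```python
import os
import signal
from pathlib import Path


def find_leftovers(prefix: str = "sixtrack", proc_root: str = "/proc") -> list[int]:
    """Return PIDs whose command name starts with `prefix` (Linux /proc layout)."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # no /proc here, e.g. not a Linux system
    pids = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited meanwhile or is not readable
        if comm.startswith(prefix):
            pids.append(int(entry.name))
    return sorted(pids)


def kill_leftovers(pids: list[int]) -> None:
    """Send SIGKILL, which cannot be trapped, to each PID."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already gone


# Listing only; nothing is killed here.
print(find_leftovers())
```

`kill_leftovers(find_leftovers())` would only make sense after the BOINC client has been stopped, since the prefix match also catches tasks that are still running legitimately.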
ID: 5604 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5605 - Posted: 6 Nov 2018, 12:50:02 UTC - in response to Message 5602.  

Four tasks were downloaded.
While ONE is running, every minute the other three are killed and three new tasks are downloaded!
I am ending my SixTrack test now!
ID: 5605 · Rating: 0
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 5606 - Posted: 6 Nov 2018, 12:58:10 UTC - in response to Message 5604.  

The normal way to end a child process would be to send a TERM signal first (which can be trapped/ignored by the child) and to send a KILL signal after a grace period.
The latter can't be trapped and tells the kernel to immediately destroy the affected process.
As I doubt there is a general kernel problem and due to the fact that the cancellation is initiated by the client I still think it's a BOINC client issue.

It may be that the client only sends the TERM signal.
One way to check this would be to read the source code.
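
For illustration, the TERM-then-KILL sequence described above could look like this. It is a minimal sketch and not the BOINC client's actual code, which has not been checked here; the helper name `stop_child` and the grace periods are assumptions.

```python
import subprocess
import sys


def stop_child(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a child to exit with TERM, then force it with KILL after a grace period."""
    proc.terminate()  # SIGTERM on POSIX: the child may trap or ignore it
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: cannot be trapped
        return proc.wait()  # reap the child so it does not remain a zombie


if __name__ == "__main__":
    # Demo child that ignores SIGTERM, so only the KILL step ends it.
    code = (
        "import signal, time; "
        "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
        "print('ready', flush=True); time.sleep(60)"
    )
    child = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    child.stdout.readline()  # wait until the child has installed its handler
    print(stop_child(child, grace_seconds=1.0))
```

On a POSIX system the demo should print -9, meaning the child was ended by signal 9 (KILL). A parent that stops after `terminate()` and never calls `wait()` would leave exactly the kind of sleeping or defunct children reported in this thread, but that is a guess about the cause, not a finding.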
ID: 5606 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5607 - Posted: 6 Nov 2018, 13:43:24 UTC

The problems with server-cancelled tasks began on 5 Nov 2018 at 8:30 UTC.
Before that, all tasks ran without those errors.
Is there a server status to be seen anywhere?
ID: 5607 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5608 - Posted: 6 Nov 2018, 17:41:38 UTC - in response to Message 5606.  

> The normal way to end a child process would be to send a TERM signal first (which can be trapped/ignored by the child) and to send a KILL signal after a grace period.
> The latter can't be trapped and tells the kernel to immediately destroy the affected process.
> As I doubt there is a general kernel problem and due to the fact that the cancellation is initiated by the client I still think it's a BOINC client issue.
>
> It may be that the client only sends the TERM signal.
> One way to check this would be to read the source code.

I retested the memory that stays allocated after tasks are aborted by the server.
At first all 14 cores were available, but meanwhile all memory is used, so 12 tasks are running at 100% CPU and 2 tasks at 0% because there is not enough memory.
A lot of aborted tasks no longer show up in BOINC Manager but have not freed their memory:
top - 18:10:18 up  4:58,  1 user,  load average: 11.94, 12.00, 12.41
Tasks: 183 total,  13 running, 168 sleeping,   0 stopped,   2 zombie
Cpu(s):  0.0%us,  0.1%sy, 85.8%ni, 14.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   9725296k total,  9350200k used,   375096k free,    13992k buffers
Swap:  2097148k total,  2095516k used,     1632k free,   171360k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5686 boinc     39  19  446m 330m 1996 R  100  3.5   5:11.88 sixtrack_lin64_
 5965 boinc     39  19  446m 330m 1996 R  100  3.5   0:33.89 sixtrack_lin64_
 5968 boinc     39  19  446m 330m 1992 R  100  3.5   0:31.76 sixtrack_lin64_
 5583 boinc     39  19  446m 330m 1996 R  100  3.5   6:52.98 sixtrack_lin64_
 5600 boinc     39  19  446m 330m 1996 R  100  3.5   6:36.72 sixtrack_lin64_
 5607 boinc     39  19  446m 330m 1992 R  100  3.5   6:30.59 sixtrack_lin64_
 5642 boinc     39  19  446m 330m 1996 R  100  3.5   5:55.95 sixtrack_lin64_
 5925 boinc     39  19  446m 330m 1996 R  100  3.5   1:10.54 sixtrack_lin64_
 5931 boinc     39  19  446m 330m 1996 R  100  3.5   1:05.43 sixtrack_lin64_
 5950 boinc     39  19  446m 330m 1992 R  100  3.5   0:48.11 sixtrack_lin64_
 5976 boinc     39  19  446m 330m 1992 R  100  3.5   0:24.64 sixtrack_lin64_
 5935 boinc     39  19  446m 330m 1992 R  100  3.5   1:02.33 sixtrack_lin64_
 1235 boinc     30  10  117m  11m 2552 S    1  0.1   1:19.89 boinc
 3331 boinc     39  19  510m 1908 1900 S    0  0.0   0:55.75 sixtrack_lin64_
 3335 boinc     39  19  510m  58m 1900 S    0  0.6   0:50.66 sixtrack_lin64_
 3340 boinc     39  19  510m 1908 1900 S    0  0.0   0:32.39 sixtrack_lin64_
 3344 boinc     39  19  510m  11m 1252 S    0  0.1   0:03.28 sixtrack_lin64_
 3360 boinc     39  19  510m 1676 1376 S    0  0.0   0:10.49 sixtrack_lin64_
 3368 boinc     39  19  510m 2580 1712 S    0  0.0   0:16.48 sixtrack_lin64_
 3374 boinc     39  19  510m 330m 1832 S    0  3.5   0:18.43 sixtrack_lin64_
 3380 boinc     39  19  510m 192m 1252 S    0  2.0   0:09.22 sixtrack_lin64_
 3384 boinc     39  19  510m 320m 1376 S    0  3.4   0:12.37 sixtrack_lin64_
 3389 boinc     39  19  510m 322m 1376 S    0  3.4   0:13.36 sixtrack_lin64_
 3393 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.32 sixtrack_lin64_
 3405 boinc     39  19  510m 330m 1900 S    0  3.5   0:17.43 sixtrack_lin64_
 3416 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.37 sixtrack_lin64_
 3421 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.19 sixtrack_lin64_
 3427 boinc     39  19  510m 324m 1372 S    0  3.4   0:12.34 sixtrack_lin64_
 3431 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.05 sixtrack_lin64_
 3435 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.37 sixtrack_lin64_
 3440 boinc     39  19  510m 322m 1252 S    0  3.4   0:04.14 sixtrack_lin64_
 3452 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.14 sixtrack_lin64_
 3458 boinc     39  19  510m 330m 1904 S    0  3.5   0:45.25 sixtrack_lin64_
 5070 boinc     39  19  510m 320m 1380 S    0  3.4   0:01.24 sixtrack_lin64_
 6001 boinc     39  19     0    0    0 Z    0  0.0   0:00.00 sixtrack_lin64_ <defunct>
 6002 boinc     39  19     0    0    0 Z    0  0.0   0:00.00 sixtrack_lin64_ <defunct>

The second table shows all processes owned by boinc after all tasks were finished, uploaded and reported,
so not a single task is shown in BOINC Manager.
top - 18:22:12 up  5:10,  1 user,  load average: 1.08, 5.20, 9.09
Tasks: 169 total,   1 running, 168 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   9725296k total,  5161796k used,  4563500k free,    17112k buffers
Swap:  2097148k total,  2094268k used,     2880k free,    22144k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1235 boinc     30  10  117m  11m 2440 S    0  0.1   1:23.49 boinc
 3331 boinc     39  19  510m 1908 1900 S    0  0.0   0:55.76 sixtrack_lin64_
 3335 boinc     39  19  510m  58m 1900 S    0  0.6   0:50.68 sixtrack_lin64_
 3340 boinc     39  19  510m 1908 1900 S    0  0.0   0:32.39 sixtrack_lin64_
 3344 boinc     39  19  510m  11m 1252 S    0  0.1   0:03.28 sixtrack_lin64_
 3360 boinc     39  19  510m 1676 1376 S    0  0.0   0:10.50 sixtrack_lin64_
 3368 boinc     39  19  510m 2580 1712 S    0  0.0   0:16.49 sixtrack_lin64_
 3374 boinc     39  19  510m 330m 1832 S    0  3.5   0:18.44 sixtrack_lin64_
 3380 boinc     39  19  510m 192m 1252 S    0  2.0   0:09.22 sixtrack_lin64_
 3384 boinc     39  19  510m 320m 1376 S    0  3.4   0:12.38 sixtrack_lin64_
 3389 boinc     39  19  510m 322m 1376 S    0  3.4   0:13.38 sixtrack_lin64_
 3393 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.35 sixtrack_lin64_
 3405 boinc     39  19  510m 330m 1900 S    0  3.5   0:17.44 sixtrack_lin64_
 3416 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.38 sixtrack_lin64_
 3421 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.20 sixtrack_lin64_
 3427 boinc     39  19  510m 324m 1372 S    0  3.4   0:12.35 sixtrack_lin64_
 3431 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.06 sixtrack_lin64_
 3435 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.38 sixtrack_lin64_
 3440 boinc     39  19  510m 322m 1252 S    0  3.4   0:04.16 sixtrack_lin64_
 3452 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.15 sixtrack_lin64_
 3458 boinc     39  19  510m 330m 1904 S    0  3.5   0:45.25 sixtrack_lin64_
 5070 boinc     39  19  510m 320m 1380 S    0  3.4   0:01.26 sixtrack_lin64_

The third table shows all processes after I stopped the BOINC client with 'sudo /etc/init.d/boinc-client stop':
top - 18:24:40 up  5:12,  1 user,  load average: 0.10, 3.20, 7.77
Tasks: 168 total,   1 running, 167 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   9725296k total,  5161456k used,  4563840k free,    18392k buffers
Swap:  2097148k total,  2091792k used,     5356k free,    28808k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3331 boinc     39  19  510m 1912 1900 S    0  0.0   0:55.77 sixtrack_lin64_
 3335 boinc     39  19  510m  58m 1900 S    0  0.6   0:50.68 sixtrack_lin64_
 3340 boinc     39  19  510m 1912 1900 S    0  0.0   0:32.40 sixtrack_lin64_
 3344 boinc     39  19  510m  11m 1252 S    0  0.1   0:03.29 sixtrack_lin64_
 3360 boinc     39  19  510m 1676 1376 S    0  0.0   0:10.51 sixtrack_lin64_
 3368 boinc     39  19  510m 2580 1712 S    0  0.0   0:16.49 sixtrack_lin64_
 3374 boinc     39  19  510m 330m 1832 S    0  3.5   0:18.44 sixtrack_lin64_
 3380 boinc     39  19  510m 192m 1252 S    0  2.0   0:09.22 sixtrack_lin64_
 3384 boinc     39  19  510m 320m 1376 S    0  3.4   0:12.39 sixtrack_lin64_
 3389 boinc     39  19  510m 322m 1376 S    0  3.4   0:13.39 sixtrack_lin64_
 3393 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.35 sixtrack_lin64_
 3405 boinc     39  19  510m 330m 1900 S    0  3.5   0:17.44 sixtrack_lin64_
 3416 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.38 sixtrack_lin64_
 3421 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.20 sixtrack_lin64_
 3427 boinc     39  19  510m 324m 1372 S    0  3.4   0:12.35 sixtrack_lin64_
 3431 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.07 sixtrack_lin64_
 3435 boinc     39  19  510m 324m 1376 S    0  3.4   0:12.38 sixtrack_lin64_
 3440 boinc     39  19  510m 322m 1252 S    0  3.4   0:04.16 sixtrack_lin64_
 3452 boinc     39  19  510m 319m 1004 S    0  3.4   0:02.15 sixtrack_lin64_
 3458 boinc     39  19  510m 330m 1904 S    0  3.5   0:45.25 sixtrack_lin64_
 5070 boinc     39  19  510m 320m 1380 S    0  3.4   0:01.27 sixtrack_lin64_
ID: 5608 · Rating: 0
captainjack

Joined: 18 Aug 15
Posts: 14
Credit: 117,668
RAC: 1,115
Message 5609 - Posted: 6 Nov 2018, 19:01:27 UTC

Me too. I have 45 aborted tasks that are each taking up 328 MiB of memory. All 16 GB of memory is currently in use, along with 1 GB of swap. The system currently has no active sixtrack test tasks showing in BOINC. Time for a reboot and to stop running sixtrack test tasks.
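
As a rough cross-check of those figures, assuming each of the 45 leftover processes really holds 328 MiB resident and that "16 GB" means 16 GiB:

```python
leftover_tasks = 45
mib_per_task = 328

total_mib = leftover_tasks * mib_per_task
total_gib = total_mib / 1024

print(f"{total_mib} MiB = {total_gib:.1f} GiB")  # 14760 MiB = 14.4 GiB
```

That would leave less than 2 GiB for everything else, which fits with the system dipping into swap.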

By the way, does anybody know what we are testing with all the sixtrack tasks? Would love to know if we are supposed to be watching for anything specific.
ID: 5609 · Rating: 0
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 5610 - Posted: 6 Nov 2018, 19:33:30 UTC

I also found one of them on my test system although I started the last task yesterday morning.
The task itself is inactive, as is the BOINC client, but it still "lives" in RAM and in the slot folder.

This line is from its init_data.xml:
<result_name>Sixtrack_1538806_1540999193.697062_9302_1</result_name>

The client_state.xml does not contain a corresponding ID but it can be found in my task list:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2488109
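
The comparison between the slot folder and client_state.xml can be scripted. This is a sketch that assumes the usual layout of a BOINC data directory: slot folders under `slots/`, each with an `init_data.xml`, and a `client_state.xml` at the top level. The regular-expression approach and the function names are illustrative choices, not part of BOINC.

```python
import re
from pathlib import Path

RESULT_NAME = re.compile(r"<result_name>([^<]+)</result_name>")


def slot_result_names(data_dir: str) -> dict[str, str]:
    """Map each slot folder name to the result name in its init_data.xml."""
    names = {}
    for init_file in sorted(Path(data_dir).glob("slots/*/init_data.xml")):
        match = RESULT_NAME.search(init_file.read_text(errors="replace"))
        if match:
            names[init_file.parent.name] = match.group(1)
    return names


def unknown_to_client(data_dir: str) -> dict[str, str]:
    """Return the slots whose result name does not appear in client_state.xml."""
    state = Path(data_dir, "client_state.xml").read_text(errors="replace")
    return {
        slot: name
        for slot, name in slot_result_names(data_dir).items()
        if name not in state
    }
```

Called on the data directory (its location depends on the installation), `unknown_to_client(...)` would list slots such as the one above, where a result still occupies a slot folder although the client no longer tracks it.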

The same WU has been sent a second time to the same host:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2488108



Another host (CP's Opteron) also got the same WU twice:
https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=1743356


As multiple sends to the same host are not very common I wonder if we stumbled over a BOINC client bug.
ID: 5610 · Rating: 0
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 5611 - Posted: 6 Nov 2018, 19:55:52 UTC - in response to Message 5610.  
Last modified: 6 Nov 2018, 20:00:53 UTC

> As multiple sends to the same host are not very common I wonder if we stumbled over a BOINC client bug.

No bug on either the client or the server side.
It's a setting in the server configuration table to avoid sending the same task to the same host and/or the same user.
Settings:
<one_result_per_user_per_wu/>
<one_result_per_host_per_wu/>
ID: 5611 · Rating: 0
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert

Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 5612 - Posted: 6 Nov 2018, 20:23:30 UTC - in response to Message 5611.  

> No bug on either the client or the server side.
> It's a setting in the server configuration table to avoid sending the same task to the same host and/or the same user.
> Settings:
> <one_result_per_user_per_wu/>
> <one_result_per_host_per_wu/>

Those server options are usually used to ensure that results from one user/host can be verified against another user/host.
If a result is sent twice to the same host this must be handled by the client without a crash.
Nonetheless there are obviously lots of crashes.

What I suspect is that those multiple sends may be the reason for the crashes.
ID: 5612 · Rating: 0
mmonnin

Joined: 20 Jun 17
Posts: 25
Credit: 2,940,586
RAC: 2,287
Message 5613 - Posted: 7 Nov 2018, 2:06:02 UTC
Last modified: 7 Nov 2018, 2:06:43 UTC

Ya mean like how I mentioned it before in this post? Not a word was said about it then.
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=415

Now nearly everything is being canceled by the server. This was working fine before the server went down.
ID: 5613 · Rating: 0
maeax

Joined: 22 Apr 16
Posts: 660
Credit: 1,720,327
RAC: 2,947
Message 5614 - Posted: 7 Nov 2018, 6:50:31 UTC - in response to Message 5613.  

> Now nearly everything is being canceled by the server. This was working fine before the server went down.

The problems with server-cancelled tasks began on 5 Nov 2018 at 8:30 UTC.
Before that, all tasks ran without those errors.
Is there a server status to be seen anywhere?
ID: 5614 · Rating: 0


©2024 CERN