Hi,

I have a patch for the freezes in test #7 in SMP mode. The bug can (and probably will) occur if you have fewer than 8 CPU threads and more than roughly 5 GB of RAM.

The bug is in the test #7 code itself (test.c: block_move()). The function calculate_chunk(&start, &end,...) returns the addresses of the first (start) and the last (end) word to be tested by this CPU. The variable end is then incremented so that it points just past the tested memory. After that, the big block is divided into one or more smaller blocks of size <= 256MB. But this division does not happen if an integer overflow occurs during the increment: end wraps around to 0, so the check pe >= end is true on the first pass and the loop finishes after a single block.

So this code may lead to a different number of blocks for the last CPU (the one with the highest memory addresses) than for all the others. A different number of blocks also means a different number of calls to do_tick(me) and therefore a different number of calls to s_barrier(). Then the deadlock is inevitable.
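
To make the mismatch concrete, here is a small stand-alone model of the loop. It is only a sketch, not the real test.c code: addresses are modeled as 32-bit byte addresses in uint32_t, BLOCK_BYTES (256 MB) stands in for the block size, and the function name count_blocks_original() and the loop shape around the unpatched check are my assumptions.

```c
#include <stdint.h>

#define BLOCK_BYTES 0x10000000u  /* 256 MB, assumed block size for this model */

/* Count loop passes (one do_tick() call each) for the chunk [start, last],
 * using the original, unpatched end-of-chunk check. Hypothetical model. */
static unsigned count_blocks_original(uint32_t start, uint32_t last)
{
    uint32_t end = last + 4u;   /* one word past the last word; wraps to 0 at the top */
    uint32_t p = start, pe = start;
    unsigned ticks = 0;
    int done = 0;

    do {
        ticks++;                /* stands in for do_tick(me) */
        if ((uint32_t)(pe + BLOCK_BYTES) > pe && pe != 0) {
            pe += BLOCK_BYTES;  /* advance by one block */
        } else {
            pe = end;           /* advancing would overflow */
        }
        if (pe >= end) {        /* original check: always true when end == 0 */
            pe = end;
            done++;
        }
        if (p == pe) {
            break;
        }
        p = pe;                 /* the real code moves memory from p to pe here */
    } while (!done);

    return ticks;
}
```

In this model, a 1 GB chunk from 0x80000000 to 0xBFFFFFFC takes 4 passes, while a 1 GB chunk from 0xC0000000 to 0xFFFFFFFC (where end wraps to 0) takes only 1. One CPU reaches the barrier once while the others wait there four times.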

The fix is quite simple:

Code:
--- a/test.c
+++ b/test.c
@@ -1202,7 +1202,7 @@ void block_move(int iter, int me)
             } else {
                 pe = end;
             }
-            if (pe >= end) {
+            if ((pe >= end && end != 0) || (pe < p && end == 0)) {
                 pe = end;
                 done++;
             }
@@ -1280,7 +1280,7 @@ void block_move(int iter, int me)
             } else {
                 pe = end;
             }
-            if (pe >= end) {
+            if ((pe >= end && end != 0) || (pe < p && end == 0)) {
                 pe = end;
                 done++;
             }
@@ -1359,7 +1359,7 @@ void block_move(int iter, int me)
             } else {
                 pe = end;
             }
-            if (pe >= end) {
+            if ((pe >= end && end != 0) || (pe < p && end == 0)) {
                 pe = end;
                 done++;
             }
Beware: Check that all three hunks apply inside the function block_move(). The same code also appears in the other tests, but those do not need to be patched because they do not increment the variable end. This is especially important if you have applied other patches and the line numbers are not exactly the same.
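
If you want to sanity-check the patched condition outside of memtest, a simplified model like the one below may help. It is a sketch under assumptions: addresses are 32-bit byte addresses in uint32_t, BLOCK_BYTES (256 MB) stands in for the block size, and count_blocks_patched() is a made-up name. Only the patched condition itself is taken from the diff.

```c
#include <stdint.h>

#define BLOCK_BYTES 0x10000000u  /* 256 MB, assumed block size for this model */

/* Count loop passes (one do_tick() call each) for the chunk [start, last],
 * using the patched end-of-chunk check. Hypothetical model. */
static unsigned count_blocks_patched(uint32_t start, uint32_t last)
{
    uint32_t end = last + 4u;   /* one word past the last word; wraps to 0 at the top */
    uint32_t p = start, pe = start;
    unsigned ticks = 0;
    int done = 0;

    do {
        ticks++;                /* stands in for do_tick(me) */
        if ((uint32_t)(pe + BLOCK_BYTES) > pe && pe != 0) {
            pe += BLOCK_BYTES;  /* advance by one block */
        } else {
            pe = end;           /* advancing would overflow */
        }
        /* Patched check: if end wrapped to 0, finish only once pe has wrapped too. */
        if ((pe >= end && end != 0) || (pe < p && end == 0)) {
            pe = end;
            done++;
        }
        if (p == pe) {
            break;
        }
        p = pe;                 /* the real code moves memory from p to pe here */
    } while (!done);

    return ticks;
}
```

In this model, a 1 GB chunk takes 4 passes both when it ends at 0xBFFFFFFC and when it ends at 0xFFFFFFFC (where end wraps to 0), so every CPU calls the barrier the same number of times.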