Canardpc.com

Canardpc.com > English Boards > Memtest86+ Official forum

Posted 15/01/2009, 06:38   #1
ChipProgrammer
Noobz0r
Memtest86+ v2.11a

I currently work at one of the memory manufacturers, writing memory testing programs. Some time ago, one of our production sites wrote to me that Memtest86+ v2.01 was showing DDR2-800 memory in an AMD-based system as running at 441 MHz. Knowing that the AMD processor cannot run DDR2-800 memory at such speeds under normal conditions, I had a look at controller.c to find out why the program was showing wrong speeds for AMD. When I fixed the code, I tried to make the corrected controller.c available for inclusion in future versions, but I was not able to contact Chris Brady at the email addresses I could find.

Well, neither Memtest86 nor Memtest86+ has my changes, so I installed Ubuntu 7.04 this week on some systems so that I can build the executables. At the same time, I needed to fix some other problems with these programs. One problem is the 64 GB limitation, even though PAE mode supports 52 bits of address space. The other problem, which was of more immediate concern, is the testing of E820 type 3 (ACPI data) memory regions, which can cause false errors with some BIOSes.
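To illustrate the E820 point, here is a minimal sketch of a region filter that only counts type 1 (usable RAM) entries as testable, so type 3 regions are never touched. The struct layout and the names `e820_entry` and `testable_bytes` are assumptions made for this example and are not taken from the Memtest86+ source. The 64-bit address and size fields are there so that regions above 64 GB can be represented at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Region types returned by BIOS INT 15h, AX=E820h. */
enum {
    E820_RAM      = 1, /* usable RAM, safe to test */
    E820_RESERVED = 2, /* reserved by the platform */
    E820_ACPI     = 3, /* ACPI data (reclaimable), may be in use by the BIOS */
    E820_NVS      = 4  /* ACPI non-volatile storage */
};

/* Hypothetical map entry for this sketch (not the Memtest86+ struct).
 * 64-bit fields keep regions above 64 GB representable. */
struct e820_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Sum the bytes that are safe to test: only type 1 regions count,
 * so type 3 (ACPI data) regions are skipped. */
uint64_t testable_bytes(const struct e820_entry *map, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (map[i].type == E820_RAM)
            total += map[i].size;
    }
    return total;
}
```

A real tester would build its list of test segments from the same check rather than just summing sizes; the sum is used here only to keep the sketch short.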

While I was in the code, I took the liberty of:
  1. fixing some formatting issues in the source itself,
  2. building mt86+_loader from the Makefile,
  3. adding support for more than the base PCI bus for Intel Nehalem controllers,
  4. adding support for Intel Clarksfield and Auburndale memory controllers,
  5. fixing the CPU family and model values, and the identification of processors based on these updated values,
  6. identifying AMD Rev10 and Rev11 processors/memory controllers correctly,
  7. removing the 64 GB limitation, so we can test our servers with 128 GB or more,
  8. and probably making more changes than I can remember at the moment.
The full source with my updates can be found here. Although I did not upload the executables or ISO images, I can if needed.
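For items 5 and 6, the patched code itself is not shown in this thread. As an illustration only, the usual way to compute the displayed family and model from the CPUID leaf 1 EAX signature looks like the sketch below. The names `cpu_sig` and `decode_cpu_signature` are made up for this example.

```c
#include <stdint.h>

/* Decoded CPUID leaf 1 signature (names are assumptions for this sketch). */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID leaf 1.
 * The extended family is added only when the base family is 0xF.
 * The extended model is used only when the base family is 6 or 0xF. */
struct cpu_sig decode_cpu_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    struct cpu_sig s;

    s.stepping = eax & 0xF;

    s.family = base_family;
    if (base_family == 0xF)
        s.family += ext_family;

    s.model = base_model;
    if (base_family == 0x6 || base_family == 0xF)
        s.model |= ext_model << 4;

    return s;
}
```

With this decoding, a signature of 0x000106A5 gives family 6, model 0x1A, and a signature of 0x00100F23 gives family 0x10, which is how the AMD family 10h and 11h parts can be told apart from the older family 0xF parts.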

There are still issues that I have not chased down yet. For one, the code does not run on an Intel Eaglelake customer reference board. Something in the transition from 32-bit protected mode to 16-bit real mode (for E820 support) is broken in all versions of Memtest86/+ that I have been able to test. A second issue is that the code does not appear to determine the actual front side bus (FSB) speed versus the expected FSB, which affects the memory clock computations.

Memtest86/+ needs a major overhaul if it wants to be taken seriously. It lacks support for multiple memory controllers, it does not decode addresses to DIMM/side/row/column/bank/DQ, and it does not have the multithread support needed to retrieve error registers from all controllers. Some side benefits of multithread support would be an increase in test speed and an increase in coverage, because memory would be stressed differently and more frequently.

--
Chip
Posted 17/01/2009, 03:25   #2
Doc TB
Teraboule Technology
Great patch. I will review the code and add all of this to 2.12.

Some remarks:

- About Mt86+ not supporting multiple memory controllers: All NUMA-based CPU architectures can be tested without problems. Of course, ECC error reporting will not be available as long as the code remains single-threaded.

- About memory decode to DIMM/side/row/etc.: That would be a great feature, but due to hardware interleaving, it's nearly impossible to know exactly where a memory address is physically located. And that would require a complete rewrite each time chipsets evolve.

- About multithreading: my ASM skills are just not good enough to code that. Multithreading with plain ASM everywhere and a basic C architecture sounds like hell to me. To increase coverage, I'm now focusing on x86-64 support.

About Clarksfield and Auburndale: if you got the values from the RS Bios Writer's Guide, you should be under Intel NDA, and that information should still be confidential. That's why I didn't add them in the last revision. But I can now publish the update.
__________________

"Any wire cut to the right length will turn out to be too short"
Posted 17/01/2009, 19:10   #3
ChipProgrammer
Noobz0r
DISCLAIMER: All comments I make to this forum are my own, and do not reflect the opinions of ANYONE else, or any entity real or otherwise.

Quote:
Originally posted by Doc TB
Great patch. I will review the code and add all of this to 2.12.

Some remarks:

- About Mt86+ not supporting multiple memory controllers: All NUMA-based CPU architectures can be tested without problems. Of course, ECC error reporting will not be available as long as the code remains single-threaded.

Exactly. ECC errors are not in the Intel Nehalem memory controller, where they belong; instead, they are in CPU MSRs.

Quote:
- About memory decode to DIMM/side/row/etc.: That would be a great feature, but due to hardware interleaving, it's nearly impossible to know exactly where a memory address is physically located. And that would require a complete rewrite each time chipsets evolve.

Not a rewrite, but individual chipset error handling routines. Hardware interleaving is not a huge hurdle to overcome, but it does require some reading between the lines and some intuitive investigation to get it right. Having the NDA documents will help with early development, assuming the chipset vendors do not provide the code themselves.

A good part of my chipset effort goes into decoding what does not get put into the documentation. AMD mostly writes great documentation for address decoding, Intel writes mediocre decoding information, and Nvidia's is missing completely (theirs is the worst of all).
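As a rough illustration of "individual chipset routines" instead of a rewrite, the sketch below keeps one decode function per chipset in a table keyed by PCI vendor and device ID. Everything in it is hypothetical: the IDs are placeholders, and the bit layout (channel interleave on address bit 6, then 10 column bits, 3 bank bits, and the row above that) is invented for the example and does not describe any real controller.

```c
#include <stddef.h>
#include <stdint.h>

/* Where a physical address lands in DRAM (fields chosen for this sketch). */
struct dram_location {
    unsigned channel;
    unsigned bank;
    unsigned row;
    unsigned column;
};

typedef struct dram_location (*decode_fn)(uint64_t phys_addr);

/* Invented layout, NOT a real chipset: bit 6 selects the channel
 * (64-byte interleave), bits 7..16 are the column, bits 17..19 the bank,
 * and everything from bit 20 up is the row. */
struct dram_location decode_example_chipset(uint64_t a)
{
    struct dram_location loc;
    loc.channel = (unsigned)((a >> 6) & 0x1);
    loc.column  = (unsigned)((a >> 7) & 0x3FF);
    loc.bank    = (unsigned)((a >> 17) & 0x7);
    loc.row     = (unsigned)(a >> 20);
    return loc;
}

struct chipset_decoder {
    uint16_t  vendor;
    uint16_t  device;
    decode_fn decode;
};

/* One entry per supported chipset; the IDs below are placeholders. */
static const struct chipset_decoder decoders[] = {
    { 0xFFFF, 0x0001, decode_example_chipset },
};

/* Return the decode routine for a chipset, or NULL if it is unknown. */
decode_fn find_decoder(uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; i < sizeof decoders / sizeof decoders[0]; i++) {
        if (decoders[i].vendor == vendor && decoders[i].device == device)
            return decoders[i].decode;
    }
    return NULL;
}
```

Supporting a new chipset then means adding one function and one table row, and an unknown chipset simply gets no decoding instead of a wrong one.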

Quote:
- About multithreading: my ASM skills are just not good enough to code that. Multithreading with plain ASM everywhere and a basic C architecture sounds like hell to me. To increase coverage, I'm now focusing on x86-64 support.

64-bit long mode programming should not directly increase coverage, since some errors can only be found with 32-bit uncached accesses. What long mode gives you is more general-purpose registers to hold variables, so the stack is used less than in 32-bit mode. Writing long mode code means that you will have to maintain two separate sources for maximum effect/benefit, unless you are going to stop supporting older processors.

The code I wrote for my job is a mixture of Asm, C, and C++. The C++ aspect is not necessary for multithreading; I have it there so that I can halt the application processors (APs) should my code exit abnormally. MT86+ does not need C++ class constructors and destructors to support multithreading; this can be written with standard C and Asm code.

One limitation this will impose is that MT86+ can no longer be relocatable, because the APs will be running the same code as the bootstrap processor (BSP). It becomes a trade-off: give up testing conventional memory in exchange for increased speed, coverage, and error handling.
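One simple way to get the speed benefit in plain C, once the APs are running the same code as the BSP, is to give each processor its own slice of the range under test. The sketch below shows only that arithmetic. It is an assumption about how the work could be divided, not a description of the code mentioned above, and the names `slice` and `cpu_slice` are made up.

```c
#include <stdint.h>

/* Half-open address range [start, end). */
struct slice {
    uint64_t start;
    uint64_t end;
};

/* Split [start, end) into ncpus contiguous slices and return the one
 * for processor number cpu (0 = BSP). The last processor also takes
 * any remainder. Assumes ncpus > 0, cpu < ncpus, and start <= end. */
struct slice cpu_slice(uint64_t start, uint64_t end,
                       unsigned cpu, unsigned ncpus)
{
    uint64_t chunk = (end - start) / ncpus;
    struct slice s;

    s.start = start + chunk * cpu;
    s.end   = (cpu == ncpus - 1) ? end : s.start + chunk;
    return s;
}
```

A real implementation would also need to start the APs and synchronize them between test passes, which is where the Asm comes in.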

Quote:
About Clarksfield and Auburndale: if you got the values from the RS Bios Writer's Guide, you should be under Intel NDA, and that information should still be confidential. That's why I didn't add them in the last revision. But I can now publish the update.

I do not have the RS Bios Writer's Guide for these two; I have PCI dumps taken from these chipsets by someone else. I compared these PCI dumps against what I have from Nehalem and X38/X48 chipsets. The earliest documentation I saw about Auburndale gave me the impression that it would be derived from Nehalem. Although I do not have physical access to Auburndale myself, the PCI dump I have seen shows that this chipset is derived from the Broadwater series; Broadwater became known as i965, and continues as the 4 series. The only things I added for these two chipsets were the device ID values, so that the code can be tested with them.

My experience with the Broadwater series of chipsets led us to understand that the CLKCFG register does not directly give the memory clock or the front side bus (FSB) clock rate. As noted in the MT86+ code, this register is used to derive a ratio based on the actual FSB. Since MT86+ does not compute actual FSB values, the memory clock shown for this series is not necessarily correct, especially when the FSB is not set to its default value. For example, an Asus P5AD2 (Alderwood, aka i925) and newer boards can change the FSB so that the chipset generates a different memory clock than the one it is designed to support. The CLKCFG register can be used as a ratio for most, if not all, i9xx chipsets, and for all of the 4 series. From the PCI dump I have seen, Auburndale does not appear to use this register, so the memory clock could be derived from the refresh counter register: divide the refresh counter value by 7.8 to get the base memory clock, and multiply that by 2 to get the DDR/2/3 speed value.
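The two calculations above can be sketched as follows. The function names are made up for this example, and the ratio passed to `mem_clock_mhz_from_fsb` is a plain numerator/denominator pair rather than a decoded CLKCFG value, since the register encoding is not given here. The 7.8 divisor presumably corresponds to the standard 7.8 microsecond refresh interval, so a counter expressed in memory clocks divided by 7.8 gives MHz; integer math is used to avoid floating point.

```c
/* Base memory clock in MHz from the refresh counter value.
 * count / 7.8 is computed as count * 10 / 78, rounded to nearest. */
unsigned mem_clock_mhz_from_refresh(unsigned refresh_count)
{
    return (refresh_count * 10u + 39u) / 78u;
}

/* DDR/DDR2/DDR3 speed value: twice the base memory clock. */
unsigned ddr_speed_from_refresh(unsigned refresh_count)
{
    return 2u * mem_clock_mhz_from_refresh(refresh_count);
}

/* Ratio-based variant: memory clock = actual FSB * num / den,
 * rounded to nearest. The ratio is an input here (assumption);
 * den must be nonzero. */
unsigned mem_clock_mhz_from_fsb(unsigned fsb_mhz, unsigned num, unsigned den)
{
    return (fsb_mhz * num + den / 2u) / den;
}
```

For example, a refresh counter value of 3120 gives a 400 MHz base clock and a speed value of 800. With the ratio-based variant, the same 2:1 ratio gives 400 MHz at a 200 MHz FSB but 440 MHz at 220 MHz, which is why the displayed clock is wrong whenever the expected FSB is used in place of the actual one.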

I put up binaries, in case you want to test them:

Precompiled bin in .gz format
Precompiled bin in .zip format

CD Image (iso) in .gz format
CD Image (iso) in .zip format

Floppy installable in .zip format

DOS executable in .zip format
Posted 19/01/2009, 18:05   #4
Brama
Noobz0r
Location: Milano - Italy
As an overclocker, I hope the program can become multithreaded in the future.
To test memory overclocks on modern platforms, you absolutely need a multithreaded program. I have had many cases where my overclock was rock solid under Memtest86+ and failed immediately under XP, Vista and Windows 7.

For us overclockers, it would be really useful.

Why not think about a collaboration between the two of you?

Thanks,

Last edited by Brama; 19/01/2009 at 18:13.
Posted 20/01/2009, 16:46   #5
Doc TB
Teraboule Technology
ChipProgrammer> Is your code gcc-4.3 ready?
Posted 22/01/2009, 09:46   #6
ChipProgrammer
Noobz0r
It is not; I do not have v4.3 installed yet.
All times are GMT +2.

