I made the mistake of buying some NVRAM cards off ebay because the price was right. They appear to be from a EMC SAN. According to the listing they're both Curtiss Wright 5453J and there are some markings on the PCB that confirm this. It's a battery backed DRAM card with 1GB of usable space, it is supposed to appear as a block device to the operating system and is basically a small NVME SSD that wipes its storage if it loses power for more than a day. In the past they were mostly used for external journals for files systems before SSD storage became affordable.
The problem I'm encountering is when I tried booting up into kernel 4.4.0 I received an error from the umem driver that the magic number didn't match. I looked into this error and found that someone made a patch for ubuntu 14.04 running kernel 3.13. So I installed ubuntu 14.04.5 and patched the umem.ko driver and now kernel 3.13 tries to initalize the devices but I receive memory access errors.
This driver looks like it was last patched in 2002 according to the source files so I'm guessing there was a compatibility problem between kernel 2.6 and later kernels that wasn't fixed. It also was written for PCI and PCI-X (maybe older interfaces too) so I would think it should work using PCIE.
this is the patch I applied originally from https://patchwork.kernel.org/patch/8180661/
Code: Select all
Comments
Dr. Net! - Eugen Rieck - Feb. 1, 2016, 3:33 p.m.
From: Eugen Rieck <eugen@drnet.at>
Allow for multiple magic numbers per model and include the magic number
for the PCIe-based 5453CN, that shares the device id of the older
PCI-X-based 5425CN.
Signed-off-by: Eugen Rieck <eugen@drnet.at>
---
The umem driver for Micro Memory (now Curtiss-Wright) NVRAM cards relied
on a unique magic number for each PCI device id. Newer cards violate this
and were thus rejected by the driver.
Developed for 3.13.0 (Ubuntu 14.04, ubuntu 3.13.0-76-generic), but merges
cleanly into current (4.5-rc2) as the driver is old and largely unchanged.
umem.c | 29 +++++++++++++++++++++++------
umem.h | 2 ++
2 files changed, 25 insertions(+), 6 deletions(-)
Patch
--- drivers/block/umem.c.orig 2016-02-01 14:51:07.463944676 +0100
+++ drivers/block/umem.c 2016-02-01 15:03:52.815934423 +0100
@@ -32,6 +32,8 @@
* 17May2002:NeilBrown - remove init_mem initialisation. Instead detect
* - a sequence of writes that cover the card, and
* - set initialised bit then.
+ * 01Feb2016:Eugen Rieck - allow multiple magic numbers per pci id to
+ * acommodate newer models
*/
#undef DEBUG /* #define DEBUG if you want debugging info (pr_debug) */
@@ -799,6 +801,9 @@
unsigned long csr_base;
unsigned long csr_len;
int magic_number;
+ int magic_numbers[MAGIC_NUMBERS_PER_DEV];
+ int magic_number_ok = 0;
+ int i;
static int printed_version;
if (!printed_version++)
@@ -850,27 +855,39 @@
switch (card->dev->device) {
case 0x5415:
card->flags |= UM_FLAG_NO_BYTE_STATUS | UM_FLAG_NO_BATTREG;
- magic_number = 0x59;
+ magic_numbers[0] = 0x59;
+ magic_numbers[1] = 0x100;
break;
case 0x5425:
card->flags |= UM_FLAG_NO_BYTE_STATUS;
- magic_number = 0x5C;
+ magic_numbers[0] = 0x5C;
+ magic_numbers[1] = 0x5E;
+ magic_numbers[2] = 0x100;
break;
case 0x6155:
card->flags |= UM_FLAG_NO_BYTE_STATUS |
UM_FLAG_NO_BATTREG | UM_FLAG_NO_BATT;
- magic_number = 0x99;
+ magic_numbers[0] = 0x99;
+ magic_numbers[1] = 0x100;
break;
default:
- magic_number = 0x100;
+ magic_numbers[0] = 0x100;
break;
}
- if (readb(card->csr_remap + MEMCTRLSTATUS_MAGIC) != magic_number) {
- dev_printk(KERN_ERR, &card->dev->dev, "Magic number invalid\n");
+ magic_number = readb(card->csr_remap + MEMCTRLSTATUS_MAGIC);
+ for (i = 0; i < MAGIC_NUMBERS_PER_DEV; i++) {
+ if (magic_numbers[i] == magic_number) {
+ magic_number_ok = 1;
+ break;
+ }
+ if (magic_numbers[i] >= 0x100) break;
+ }
+ if (!magic_number_ok) {
+ dev_printk(KERN_ERR, &card->dev->dev, "Magic number 0x%02x invalid for device 0x%04x\n", magic_number, card->dev->device);
ret = -ENOMEM;
goto failed_magic;
}
--- drivers/block/umem.h.orig 2016-02-01 14:51:16.055944561 +0100
+++ drivers/block/umem.h 2016-02-01 14:47:38.463947477 +0100
@@ -15,6 +15,8 @@
#define IRQ_TIMEOUT (1 * HZ)
+#define MAGIC_NUMBERS_PER_DEV 2
+
/* CSR register definition */
#define MEMCTRLSTATUS_MAGIC 0x00
#define MM_MAGIC_VALUE (unsigned char)0x59
lsblk reports the following
Code: Select all
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 512M 0 part /boot/efi
└─sda2 8:2 0 19.5G 0 part /
sdb 8:16 0 524.5G 0 disk
└─sdb1 8:17 0 524.5G 0 part /usr
umema 251:0 0 1G 0 disk
umemb 251:64 0 1G 0 disk
Code: Select all
root@ubuntu-test:/home/cscott# dmesg | grep umem
[ 5.409505] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409509] umem 0000:05:00.0: I/O error on sector 16/4096
[ 5.409511] umem 0000:05:00.0: DMAstat -
[ 5.409552] umem 0000:05:00.0: DMAstat -
[ 5.409555] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409556] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409558] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 5.409560] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409564] umem 0000:05:00.0: I/O error on sector 0/4096
[ 5.409565] umem 0000:05:00.0: DMAstat -
[ 5.409604] umem 0000:05:00.0: DMAstat -
[ 5.409607] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409608] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409610] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 5.409612] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409615] umem 0000:05:00.0: I/O error on sector 0/4096
[ 5.409616] umem 0000:05:00.0: DMAstat -
[ 5.409659] umem 0000:05:00.0: DMAstat -
[ 5.409662] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409663] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409665] umem 0000:05:00.0: Fault Address 0x0000001f00, Fault Data 0x116c30a4425832b1
[ 5.409667] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409670] umem 0000:05:00.0: I/O error on sector 8/4096
[ 5.409671] umem 0000:05:00.0: DMAstat -
[ 5.409712] umem 0000:05:00.0: DMAstat -
[ 5.409715] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409716] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409718] umem 0000:05:00.0: Fault Address 0x0000002f00, Fault Data 0x10f4104798a000f0
[ 5.409719] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409729] umem 0000:05:00.0: I/O error on sector 16/4096
[ 5.409731] umem 0000:05:00.0: DMAstat -
[ 5.409781] umem 0000:05:00.0: DMAstat -
[ 5.409784] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409785] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409787] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 5.409789] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409792] umem 0000:05:00.0: I/O error on sector 0/4096
[ 5.409794] umem 0000:05:00.0: DMAstat -
[ 5.409833] umem 0000:05:00.0: DMAstat -
[ 5.409836] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409838] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409840] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 5.409841] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409845] umem 0000:05:00.0: I/O error on sector 0/4096
[ 5.409846] umem 0000:05:00.0: DMAstat -
[ 5.409888] umem 0000:05:00.0: DMAstat -
[ 5.409891] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 5.409892] umem 0000:05:00.0: Multi-bit EDC error
[ 5.409895] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 5.409896] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 5.409900] umem 0000:05:00.0: I/O error on sector 0/4096
[ 5.409901] umem 0000:05:00.0: DMAstat -
[ 5.409939] umem 0000:05:00.0: DMAstat -
and so on and so forth
[ 21.878134] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878136] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878140] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878142] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878148] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878150] umem 0000:05:00.0: DMAstat -
[ 21.878195] umem 0000:05:00.0: DMAstat -
[ 21.878199] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878201] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878205] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878207] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878213] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878215] umem 0000:05:00.0: DMAstat -
[ 21.878262] umem 0000:05:00.0: DMAstat -
[ 21.878266] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878268] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878272] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878275] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878280] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878282] umem 0000:05:00.0: DMAstat -
[ 21.878327] umem 0000:05:00.0: DMAstat -
[ 21.878332] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878334] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878337] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878340] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878345] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878347] umem 0000:05:00.0: DMAstat -
[ 21.878396] umem 0000:05:00.0: DMAstat -
[ 21.878400] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878402] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878406] umem 0000:05:00.0: Fault Address 0x0000003f00, Fault Data 0x620c20a3b81580a0
[ 21.878409] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878414] umem 0000:05:00.0: I/O error on sector 24/4096
[ 21.878416] umem 0000:05:00.0: DMAstat -
[ 21.878462] umem 0000:05:00.0: DMAstat -
[ 21.878467] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878469] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878473] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878475] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878481] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878483] umem 0000:05:00.0: DMAstat -
[ 21.878528] umem 0000:05:00.0: DMAstat -
[ 21.878532] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878535] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878538] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878541] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878545] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878548] umem 0000:05:00.0: DMAstat -
[ 21.878593] umem 0000:05:00.0: DMAstat -
[ 21.878598] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878600] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878604] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878607] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878612] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878614] umem 0000:05:00.0: DMAstat -
[ 21.878658] umem 0000:05:00.0: DMAstat -
[ 21.878662] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878665] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878668] umem 0000:05:00.0: Fault Address 0x0000007f00, Fault Data 0xe5754bb094460c14
[ 21.878671] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878676] umem 0000:05:00.0: I/O error on sector 56/4096
[ 21.878678] umem 0000:05:00.0: DMAstat -
[ 21.878723] umem 0000:05:00.0: DMAstat -
[ 21.878728] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878730] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878734] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878737] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878742] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878744] umem 0000:05:00.0: DMAstat -
[ 21.878788] umem 0000:05:00.0: DMAstat -
[ 21.878792] umem 0000:05:00.0: Memory access error detected (err count 0)
[ 21.878795] umem 0000:05:00.0: Multi-bit EDC error
[ 21.878798] umem 0000:05:00.0: Fault Address 0x0000000f00, Fault Data 0xf6ee04e64a340280
[ 21.878800] umem 0000:05:00.0: Fault Check 0x00, Fault Syndrome 0x00
[ 21.878805] umem 0000:05:00.0: I/O error on sector 0/4096
[ 21.878807] umem 0000:05:00.0: DMAstat -
Right now they are sharing IRQs with other devices I'm not sure if that matters but in the bios I have them mapped to IRQs that have some USB peripherals too.
and some more debug info
output of
Code: Select all
lscpi -nnk
Code: Select all
04:00.0 Memory controller [0580]: Micro Memory MM-5425CN PCI 64/66 Memory Module with Battery Backup [1332:5425] (rev 08)
Subsystem: Micro Memory Device [1332:5453]
Kernel driver in use: umem
05:00.0 Memory controller [0580]: Micro Memory MM-5425CN PCI 64/66 Memory Module with Battery Backup [1332:5425] (rev 08)
Subsystem: Micro Memory Device [1332:5453]
Code: Select all
root@ubuntu-test:/# uname -a
Linux ubuntu-test 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux