Archive for July, 2017

EEPROM Wear Leveling

Wednesday, July 5th, 2017 Nigel Jones

A problem that occurs from time to time is the need for an embedded system to keep track of the number of doodads it has processed since it was put in to service. Clearly you can’t keep this value in RAM as it will be lost when power is removed. Thus the obvious solution is to store the count in EEPROM. However, common EEPROM  specifications are 10^5 guaranteed write cycles, with 10^6 typical write cycles. If you expect to process say 20 million doodads during the life of your product you clearly can’t do something like this:

__eeprom uint32_t doodad_cntr;
...
doodad_cntr++;

The above assumes that the __eeprom memory space qualifier is sufficient for the compiler to generate the requisite code sequences to access EEPROM.

The first part of the solution is to define an array in EEPROM for holding the doodad counter. Thus:

__eeprom uint32_t doodad_cntr[DOODAD_ARRAY_SIZE];

Since the number of doodads processed increases monotonically, on power up one simply searches the array looking for the last entry, – i.e. the entry in the array with the biggest value. One records this offset into a static variable. Thus:

static uint8_t doodad_array_offset;
doodad_array_offset = find_last_used_entry();

The next time a doodad is processed, then the write occurs at the next location beyond doodad_array_offset in doodad_cntr[]. This simple technique immediately gives one a guaranteed increase of DOODAD_ARRAY_SIZE more writes than using a single variable. Whilst this is nice, there are a number of other things that one can do to improve the situation. These are all based on the observation that the erased state of EEPROM is a ‘1’ and not a zero. Thus a ‘blank’ EEPROM actually consists of 0xFF …. 0xFF, without a zero in sight. To take advantage of this, rather than writing the actual doodad count value to the array, instead write the 1’s complement of the value. This means that rather than writing, for example, the value 0x00000001 to EEPROM, you’ll instead write 0xFFFFFFFE. In this case the actual number of bits that have to change state is just 1 rather than 31, resulting in considerably less stress on the EEPROM, potentially increasing the life of the EEPROM. Note that this technique is equivalent to initializing the EEPROM to 0xFFFFFFFF and then decrementing.

Writing the 1’s complement also opens up another potential improvement. EEPROM is often byte or word addressable. Furthermore, to program the EEPROM is usually a 2-step process consisting of erasing the memory location (i.e. erase it to 0xFF) and then programming the erased location with the desired value. Using 1’s complement values, it should be apparent that for most of the time, many of the bytes in the EEPROM will be at 0xFF. If we combine this with the fact that most of the time only one out of the four bytes in a uint32_t will change value when a uint32_t value is incremented, then we can dramatically minimize the actual number of erases / writes performed on the EEPROM. The code to do this looks a bit like this:

typedef union {
    uint8_t cnt_bytes[4U];
    uint32_t counter;
} eeprom_union_t;
__eeprom eeprom_union_t doodad_cntr[DOODAD_ARRAY_SIZE];

void increment_doodad(void) {
    eeprom_union_t current_value.counter = doodad_cntr[doodad_array_offset].counter;
    if (++doodad_array_offset >= DOODAD_ARRAY_SIZE) {
        doodad_array_offset = 0U;
    }
    --current_value.counter;  //1's complement increment
    for (uint8_t i = 0U; i < 4U; ++i) {
        //Are the bytes different? If no, then do nothing
        if (doodad_array[doodad_array_offset].cnt_bytes[i] != current_value.cnt_bytes[i]) {
            //Need to erase byte
            erase(doodad_array[doodad_array_offset].cnt_bytes[i]);
            //See if we need to actually program now
            if (current_value.cnt_bytes[i] != 0xFFU) {
                doodad_array[doodad_array_offset].cnt_bytes[i] = current_value.cnt_bytes[i];
            }
        }
    }
}

Clearly this is quite a bit more work. However given that EEPROM erase / write times are in the 2 – 10ms arena, the time taken in comparison to execute the above code is insignificant. Given that it will also save millions of EEPROM writes over the life of the product, then the complexity is well worth it.
Finally if your product needs to keep track of an enormous number of doodads, then you’ll likely have no choice other than to keep track of when EEPROM cells go bad. This is typically done by assigning another area of EEPROM that has  DOODAD_ARRAY_SIZE bits – e.g._eeprom uint8_t bad_cell[DOODAD_ARRAY_SIZE / sizeof(uint8_t)]. These bits are erased to 1. Once you detect that a write has failed at a certain cell in doodad_cntr[], then you change the corresponding bit in bad_cell[] from ‘1’ to a ‘0’ and the cell is considered bad for all time.  Obviously you then have to interrogate the bad_cell[] array to determine whether the code should use a specific cell.