Original scientific paper
MIDEM Society
Journal of Microelectronics, Electronic Components and Materials
Vol. 46, No. 4(2016), 257 – 266

Radiation Induced Multiple Bit Upset Prediction and Correction in Memories using Cost Efficient CMC
A. Ahilan1, P. Deepa2
1Full-time research scholar, GCT Coimbatore
2Assistant Professor, GCT Coimbatore
* Corresponding Author's e-mail: listentoahil@gmail.com

Abstract: This paper presents a cost-efficient technique for correcting multiple bit upsets (MBUs) in order to protect memories against radiation. Many complex error correction codes (ECCs) have previously been used to protect memories from MBUs, but their major drawback is a high redundant memory overhead. The proposed method, called counter matrix code (CMC), uses a combinational ones counter and a parity generator with less redundant memory overhead. The CMC error predictor predicts the exact number of upsets before the actual error detection and correction process. The proposed technique uses Encode-Compare to minimize the cost and increase the speed of the decoding process. The results are compared with well-known codes such as CRC, Hamming and other matrix codes. The results show that the correction coverage per cost (CCC) of the proposed scheme is higher than that of other traditional techniques. The mean time to repair (MTTR) of the proposed scheme is about three times lower than that of the Xilinx cyclic redundancy check (CRC) + Reload technique at 100% correction coverage. At the same time, the MTTR of the proposed scheme is 0.3 ms, 0.2 ms and 1.8 ms lower than that of I3D, DMC and MC, respectively, with improved correction coverage.
Keywords: Multiple bit upsets (MBUs); memories; ones counter; parity codes; mean time to repair (MTTR)

Prediction and Correction of Radiation-Induced Multiple Bit Errors in Memories using Efficient CMC Codes
Abstract: The article presents an efficient method for correcting multiple bit upsets (MBUs) to protect memories against radiation. In the past, many complex error correction methods were used to protect memories, but they required a large amount of memory space. The proposed CMC method combines a counter and a parity generator with a lower redundant memory requirement. CMC predicts the exact number of errors before the actual detection and correction. The results are compared with other methods such as CRC, Hamming and others. The results show more efficient correction than conventional methods, with a mean time to repair 3 times shorter than that of the Xilinx CRC technique. At the same time, the MTTR is 0.3 ms, 0.2 ms and 1.8 ms shorter than that of I3D, DMC and MC.
Keywords: multiple bit upsets (MBUs); memories; parity; counter; parity codes; mean time to repair (MTTR)

1 Introduction
Today, the Electronic Design Automation (EDA) industry aims to make reliability the next level of radiation protection by drawing on advances in fault-tolerant techniques to protect CMOS memory chips and by promoting the protected memory chips for space and safety-critical applications. SRAM memories are mainly used by reconfigurable devices such as field-programmable gate arrays (FPGAs) and recent programmable systems on chip (SoCs). The use of SRAM has increased recently, and SRAM now occupies more than 90% of the chip area in modern SoCs [1-3]. These SRAM memories are disturbed by soft errors, which degrade system reliability and sustainability [4-5]. With minimum transistor sizes and increased memory density due to technology scaling, memories are becoming increasingly susceptible to multiple bit upsets (MBUs) [6]. The largest MBU size observed in a neutron-induced experiment is 24 bits [7]. For smaller nanometer technologies, the MBU size is even larger [6]. This clearly shows the importance of protecting SRAM memories against MBU incidents.
Several proven techniques have been proposed to protect SRAM memories from radiation-induced soft errors in FPGA configuration frames. The Xilinx design flow includes a single event upset (SEU) mitigation step to cope with single-bit soft errors [18]. In addition, Xilinx offers correction of two adjacent erroneous bits through an IP block, a soft error mitigation controller based on a global cyclic redundancy check (CRC) and error correction coding (ECC) [19]. The most common and efficient approach to preserving a good level of reliability for memory words is to use ECCs. Hamming and odd-weight codes are the ECCs most widely used for memory protection against radiation-induced soft errors, because they mitigate single bit upsets (SBUs) with low energy and area overhead [8], [9]. However, a single charged particle can provoke MBUs in memory words, and these MBUs cannot be corrected by single-bit-correcting ECCs. More advanced ECCs such as Reed–Solomon codes [15], Reed–Muller codes [10] and punctured difference set (PDS) codes [16] have been used to mitigate MBUs in memories, but their encoding and decoding steps are more complex. Moreover, MBU correction with these codes comes at the expense of high area, delay and power consumption.
In matrix code (MC) [11], two errors are corrected in all cases based on Hamming and vertical syndrome bits. Recently, the decimal matrix code (DMC) was proposed by Guo et al. to correct MBUs with high reliability, but it uses more redundant bits: for a 32-bit memory word, DMC needs 36 redundant bits. These extra bits occupy more area in the memory chip [12]. A parallel error correction code has been presented to correct MBUs, at the cost of a large area overhead [13]. More recently, in [14], 2-D ECCs such as 2-D SHMC (Symbolic Hamming Matrix Code) and 2-D RMC (Reconfigurable Matrix Code) were proposed to efficiently mitigate MBUs in a 32-bit memory word. The advantage of these codes is that the delay is minimized by using an Encode-Compare mechanism instead of a Decode-Compare mechanism. In [22], an approach that combines an interleaved 3-D parity technique (I3D) with an erasure code was devised for use at the architectural level. It uses horizontal, vertical and diagonal parity bits to detect MBUs and erasure codes to correct them. The results of this approach show that it needs additional recovery time to correct MBUs compared with other codes. Based on the combinational ones counter and parity code, a preliminary version of the algorithm was proposed for MBU prediction and correction in SRAM [17].
In the proposed work, intra-word and inter-word error detection and correction, together with error prediction, are introduced by means of a combinational counting operation. The redundant bits used for detection and correction are computed from the outputs of the row and column counters. Computing the redundant bits from a group of words reduces the redundant memory overhead. This work uses the Encode-Compare mechanism instead of the Decode-Compare mechanism in the decoder to reduce the delay overhead.
The remainder of this paper is organized as follows. In Section 2, the proposed CMC is introduced and its encoder and decoder architectures are given with sample calculations. Section 3 discusses the correction coverage and overhead analysis of the various MBU mitigation methods. Conclusions and ideas for future work are given in Section 4.
2 Proposed counter matrix code
In this section, the CMC encoding and decoding algorithms are proposed to predict and correct MBUs, and the VLSI architectures of the encoder and decoder are presented. The proposed CMC-based encoding and decoding algorithm lends itself to detecting both inter-word and intra-word MBUs in a memory system. What differentiates CMC from other coding techniques is soft error prediction, which predicts the exact number of soft errors present in the memories before the correction task.

Table 1: 128-bit logical organization of CMC

| S.No | Symbol8 | Symbol7 | Symbol6 | Symbol5 | Symbol4 | Symbol3 | Symbol2 | Symbol1 | HCC | HPC |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0(31-28) | B0(27-24) | B0(23-20) | B0(19-16) | B0(15-12) | B0(11-8) | B0(7-4) | B0(3-0) | H0(3-0) | Hp0(3-0) |
| 1 | B1(31-28) | B1(27-24) | B1(23-20) | B1(19-16) | B1(15-12) | B1(11-8) | B1(7-4) | B1(3-0) | H1(3-0) | Hp1(3-0) |
| 2 | B2(31-28) | B2(27-24) | B2(23-20) | B2(19-16) | B2(15-12) | B2(11-8) | B2(7-4) | B2(3-0) | H2(3-0) | Hp2(3-0) |
| 3 | B3(31-28) | B3(27-24) | B3(23-20) | B3(19-16) | B3(15-12) | B3(11-8) | B3(7-4) | B3(3-0) | H3(3-0) | Hp3(3-0) |
| VCC | V(31-28) | V(27-24) | V(23-20) | V(19-16) | V(15-12) | V(11-8) | V(7-4) | V(3-0) | | |
| VPC | Vp(31-28) | Vp(27-24) | Vp(23-20) | Vp(19-16) | Vp(15-12) | Vp(11-8) | Vp(7-4) | Vp(3-0) | | |

A. Ahilan et al; Informacije Midem, Vol. 46, No. 4(2016), 257 – 266
2.1 Proposed CMC encoder and decoder 
The cost of an ECC technique is directly proportional to the number of redundant bits required [11]. In the proposed CMC, a group of words, rather than the single word used in existing works, is taken as input to the encoder and decoder in order to reduce the number of redundant bits. That is, M words of N bits each are arranged in rows to form an M×N matrix. Each word (row) is divided into k symbols of m bits, so that N = k×m. The horizontal counter codes (HCC) and vertical counter codes (VCC) comprise the horizontal counter bits H0(3-0) … H3(3-0) and the vertical counter bits V(3-0) … V(31-28), which are used for error prediction. The horizontal parity codes (HPC) and vertical parity codes (VPC) comprise the horizontal parity bits Hp0(3-0) … Hp3(3-0) and the vertical parity bits Vp(3-0) … Vp(31-28), which are used for error correction. To explain the proposed CMC, 32-bit words arranged in 4 rows to form a 4×32 matrix are considered as an example, as shown in Table 1. The number of parity bits required for each group length is given in Table 2. It shows that a larger number of words per group needs fewer redundant bits per word. For example, a single group of 8 words needs 64 redundant bits, whereas the same 8 words split into two groups of 4 need 2×48 = 96 redundant bits. However, a larger number of words in a group reduces the percentage of correction coverage. For this reason, this work limits the number of words in a group to 4.
Table 2: Required number of parity bits per group

| No. of words per group | No. of redundant bits |
|---|---|
| 1 | 24 |
| 2 | 40 |
| 3 | 44 |
| 4 | 48 |
| 5 | 52 |
| 6 | 56 |
| 7 | 60 |
| 8 | 64 |
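The grouping trade-off discussed above can be checked with a few lines of arithmetic. The following is a minimal sketch that only reuses the values of Table 2; the helper name `total_redundant_bits` is hypothetical and assumes that the number of words is an exact multiple of the group size.

```python
# Redundant bits per group, copied from Table 2 (key: words per group).
REDUNDANT_BITS = {1: 24, 2: 40, 3: 44, 4: 48, 5: 52, 6: 56, 7: 60, 8: 64}


def total_redundant_bits(num_words: int, group_size: int) -> int:
    """Redundant bits needed to protect num_words words in groups of group_size.

    Hypothetical helper: assumes num_words is a multiple of group_size.
    """
    if num_words % group_size != 0:
        raise ValueError("num_words must be a multiple of group_size")
    return (num_words // group_size) * REDUNDANT_BITS[group_size]


if __name__ == "__main__":
    for size in (1, 2, 4, 8):
        total = total_redundant_bits(8, size)
        print(f"8 words in groups of {size}: {total} bits, {total / 8:.1f} per word")
```

For 8 words, one group of 8 costs 64 bits and two groups of 4 cost 96 bits, which reproduces the example in the text.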
The proposed CMC has two steps. In the first step, a combinational ones-counter operation is performed on the data bits; it is used for error prediction and for reducing the number of redundant bits needed for the subsequent error detection and correction. For an array of memory words, the horizontal (row) counter code bits are calculated using Equation 1. For example, the horizontal counter code of the first row word is given in Equations 2 to 5.
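$$H^{M}_{m} = \sum_{i=0}^{k-1} B^{M}_{4i+m} \qquad (1)$$

$$H^{0}_{0} = B^{0}_{0} + B^{0}_{4} + B^{0}_{8} + B^{0}_{12} + B^{0}_{16} + B^{0}_{20} + B^{0}_{24} + B^{0}_{28} \qquad (2)$$

$$H^{0}_{1} = B^{0}_{1} + B^{0}_{5} + B^{0}_{9} + B^{0}_{13} + B^{0}_{17} + B^{0}_{21} + B^{0}_{25} + B^{0}_{29} \qquad (3)$$

$$H^{0}_{2} = B^{0}_{2} + B^{0}_{6} + B^{0}_{10} + B^{0}_{14} + B^{0}_{18} + B^{0}_{22} + B^{0}_{26} + B^{0}_{30} \qquad (4)$$

$$H^{0}_{3} = B^{0}_{3} + B^{0}_{7} + B^{0}_{11} + B^{0}_{15} + B^{0}_{19} + B^{0}_{23} + B^{0}_{27} + B^{0}_{31} \qquad (5)$$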
For an array of memory words, the vertical (column) counter code bits are calculated using Equation 6. For example, the vertical counter codes of the first four bit columns are given in Equations 7 to 10.
  
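$$V_{N} = \sum_{r=0}^{M-1} B^{r}_{N} \qquad (6)$$

$$V_{0} = B^{0}_{0} + B^{1}_{0} + B^{2}_{0} + B^{3}_{0} \qquad (7)$$

$$V_{1} = B^{0}_{1} + B^{1}_{1} + B^{2}_{1} + B^{3}_{1} \qquad (8)$$

$$V_{2} = B^{0}_{2} + B^{1}_{2} + B^{2}_{2} + B^{3}_{2} \qquad (9)$$

$$V_{3} = B^{0}_{3} + B^{1}_{3} + B^{2}_{3} + B^{3}_{3} \qquad (10)$$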
where k is the number of symbols in a word, m is the number of bits in a symbol (as a subscript, m denotes the bit position within a symbol) and M is the number of words in the array (as a superscript, M denotes the word index). In the second step, the horizontal and vertical parity bits are calculated from the horizontal and vertical counter codes. Horizontal parity bits are calculated from the horizontal counter codes using Equation 11. Similarly, vertical parity bits are calculated from the vertical counter codes using Equation 12. Finally, both intra-word and inter-word errors are corrected in the decoding step.
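$$Hp^{M}_{m} = \begin{cases} 0 & \text{for } H^{M}_{m} = 0, 2, \ldots, k \\ 1 & \text{for } H^{M}_{m} = 1, 3, \ldots, k-1 \end{cases} \qquad (11)$$

$$Vp_{N} = \begin{cases} 0 & \text{for } V_{N} = 0, 2, \ldots, k \\ 1 & \text{for } V_{N} = 1, 3, \ldots, k-1 \end{cases} \qquad (12)$$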
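The two encoding steps (ones counting, then parity of each count) can be sketched in software as follows. This is an illustrative Python model, not the authors' Verilog implementation; the function names `bit` and `cmc_encode` and the representation of each word as a 32-bit integer are assumptions made for the example. The sample data are the four words of Table 3 (a).

```python
from typing import List, Tuple

K_SYMBOLS = 8                    # k: symbols per word
M_BITS = 4                       # m: bits per symbol
WORD_BITS = K_SYMBOLS * M_BITS   # N = 32

Codes = Tuple[List[List[int]], List[int], List[List[int]], List[int]]


def bit(word: int, pos: int) -> int:
    """Return bit number pos (0 = least significant) of word."""
    return (word >> pos) & 1


def cmc_encode(words: List[int]) -> Codes:
    """Return (HCC, VCC, HPC, VPC) for a group of 32-bit words.

    hcc[r][m] counts the ones at bit position m of every symbol in word r;
    vcc[n] counts the ones in bit column n over all words of the group.
    The parity codes are the least significant bit of each count.
    """
    hcc = [[sum(bit(w, M_BITS * i + m) for i in range(K_SYMBOLS))
            for m in range(M_BITS)] for w in words]
    vcc = [sum(bit(w, n) for w in words) for n in range(WORD_BITS)]
    hpc = [[count & 1 for count in row] for row in hcc]
    vpc = [count & 1 for count in vcc]
    return hcc, vcc, hpc, vpc


if __name__ == "__main__":
    # Rows of Table 3 (a): symbols 1010, 0101, 1001 and 0110 repeated 8 times.
    group = [0xAAAAAAAA, 0x55555555, 0x99999999, 0x66666666]
    hcc, vcc, hpc, vpc = cmc_encode(group)
    for row in hcc:
        # Print the four counts with bit position 3 first, as in the table.
        print("".join(str(count) for count in reversed(row)))
```

Printed with bit position 3 first, the four rows give the HCC column of Table 3 (a), and every column count equals 2 with all parity bits equal to 0.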
The encoding and decoding algorithms are given be-
low to understand the flow.
Algorithm for Encoding.
ACW – Array of configuration words
HCC – Horizontal (row) counter codes
VCC – Vertical (column) counter codes
HPC – Horizontal (row) parity codes
VPC – Vertical (column) parity codes
Input: ACW [4 × 32 = 128 bits]
Output: HCC, VCC, HPC, VPC
ACW to be written
Split each configuration word into k symbols
while symbols = true do
    for all H(M,m) ∈ HCC do onescount(ACW);
    for all Hp(M,m) ∈ HPC do parity(HCC);
    for all V(N) ∈ VCC do onescount(ACW);
    for all Vp(N) ∈ VPC do parity(VCC);
    update HCC, VCC, HPC, and VPC;
end while
return HCC, VCC, HPC, VPC;
Algorithm for Decoding.
Input: Errored ACW [4 × 32 = 128 bits], Hcc, Vcc, Hpc, Vpc
Output: error prediction value (epv), corrected word (cw)
Read errored ACW [ACWm]
Split each configuration word into k symbols
Read Hcc, Vcc, Hpc, Vpc
while symbols = true do
    for all H(M,m) ∈ Hcc' do onescount(ACWm);
    for all Hp(M,m) ∈ Hpc' do parity(Hcc');
    for all V(N) ∈ Vcc' do onescount(ACWm);
    for all Vp(N) ∈ Vpc' do parity(Vcc');
    update Hcc', Vcc', Hpc', Vpc';
    find hsc = diff(Hcc - Hcc')
    find hsp = diff(Hpc - Hpc')
    find vsc = diff(Vcc - Vcc')
    find vsp = diff(Vpc - Vpc')
    if ((hsp == 0) & (vsp == 0))
    begin
        syndrome = 0
        error = 0
    end
    else
    begin
        syndrome ≠ 0
        error ≠ 0
        epv = {hsc, vsc};
        Bintracorrect = ACWm XOR vs
        Bintercorrect = ACWm XOR hs
    end
end while
return epv, cw;
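The Encode-Compare step of the decoding algorithm can be illustrated with the following Python sketch. It is not the authors' decoder: the names `encode`, `encode_compare` and `correct_single_word` are hypothetical, the predicted upset count is taken here as the sum of the absolute counter differences (exact only when the flips inside one counter group all go in the same direction), and the correction routine covers only the case in which all upsets lie in a single word, where the vertical parity syndrome marks exactly the flipped positions.

```python
from typing import Dict, List, Tuple

K, M_BITS = 8, 4      # symbols per word and bits per symbol (k and m in the text)
N = K * M_BITS        # word width: 32 bits

Codes = Tuple[List[List[int]], List[int], List[List[int]], List[int]]


def encode(words: List[int]) -> Codes:
    """Return (HCC, VCC, HPC, VPC) for a group of N-bit words."""
    hcc = [[sum((w >> (M_BITS * i + m)) & 1 for i in range(K))
            for m in range(M_BITS)] for w in words]
    vcc = [sum((w >> n) & 1 for w in words) for n in range(N)]
    hpc = [[c & 1 for c in row] for row in hcc]
    vpc = [c & 1 for c in vcc]
    return hcc, vcc, hpc, vpc


def encode_compare(read_words: List[int], stored: Codes) -> Dict[str, object]:
    """Re-encode the words read from memory and compare with the stored codes."""
    hcc, vcc, hpc, vpc = stored
    hcc2, vcc2, hpc2, vpc2 = encode(read_words)
    hsc = [[a - b for a, b in zip(r, r2)] for r, r2 in zip(hcc, hcc2)]
    vsc = [a - b for a, b in zip(vcc, vcc2)]
    hsp = [[a ^ b for a, b in zip(r, r2)] for r, r2 in zip(hpc, hpc2)]
    vsp = [a ^ b for a, b in zip(vpc, vpc2)]
    return {
        # Same condition as the algorithm: no error if both parity syndromes are 0.
        "error": any(any(r) for r in hsp) or any(vsp),
        "epv": (hsc, vsc),
        # Assumption of this sketch: flips inside one counter group do not cancel.
        "predicted_upsets": sum(abs(d) for r in hsc for d in r),
        "hsp": hsp,
        "vsp": vsp,
    }


def correct_single_word(read_words: List[int], result: Dict[str, object]) -> List[int]:
    """Flip the bits marked by vsp in the only word whose counters changed.

    Assumption of this sketch: all upsets lie in one word; in any other case
    the data is returned unchanged.
    """
    hsc, _ = result["epv"]
    rows = [i for i, r in enumerate(hsc) if any(r)]
    if len(rows) != 1:
        return list(read_words)
    mask = sum(b << n for n, b in enumerate(result["vsp"]))
    fixed = list(read_words)
    fixed[rows[0]] ^= mask
    return fixed


if __name__ == "__main__":
    original = [0xAAAAAAAA, 0x55555555, 0x99999999, 0x66666666]  # Table 3 (a)
    faulty = [0x55555555, 0x55555555, 0x99999999, 0x66666666]    # Table 3 (b)
    result = encode_compare(faulty, encode(original))
    print(result["error"], result["predicted_upsets"])
    print(correct_single_word(faulty, result) == original)
```

For the intra-word case of Table 3 (b), the sketch reports 32 predicted upsets and restores the first word. For the inter-word case of Table 3 (c) it detects the error and predicts 16 upsets, but locating and correcting such patterns is left to the decoder described in the text.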
2.2  Proposed fault-tolerant memory architecture
The proposed fault-tolerant memory architecture is illustrated in Figure 1. First, during the encoding process, the original data bits D are fed to the encoder, and HCC, HPC and VPC are obtained from the CMC encoder. The resulting CMC codeword consists of data and redundancy bits, which are stored in separate SRAM memories. MBUs that occur in the memory are corrected during the decoding process using the CMC Encode-Compare mechanism.
Figure 1: Fault-tolerant memory architecture
The detailed architecture of the CMC encoder is shown in Figure 2. First, the HCC and VCC bits are computed by performing an 8-bit combinational counting operation on the selected sliced bits of the symbols in each row and a 4-bit combinational counting operation on the selected sliced bits of the symbols in each column, respectively. Second, the 4-bit HPC is computed by XOR operations on the HCCs of the respective row, giving 16 HPC bits in total for 4 rows. The 1-bit VPC is computed by XOR operations on the VCCs of the respective column, giving 32 VPC bits in total for 32 columns.
The proposed CMC encoder consists of two combinational ones-counter circuits, namely an 8-bit combinational ones counter and a 4-bit combinational ones counter. The 8-bit combinational ones counter (row counter) is shown in Figure 3(a). The row counter counts the number of ones using 9 half adders (HAs), 2 full adders (FAs) and 2 XOR gates, and is described by Equation 13. Similarly, the 4-bit combinational ones counter (column counter) shown in Figure 3(b) counts the number of ones using 4 half adders (HAs) and one XOR gate, and is described by Equation 14. The detailed architecture of the CMC decoder is shown in Figure 4. The decoder consists of a predictor, a syndrome calculator (detector), a locator and a corrector. The horizontal and vertical syndrome calculators are used to detect and locate the MBUs in the memories.
 
$$\begin{aligned} out[0] &= (a \oplus b) \oplus (c \oplus d) \oplus (e \oplus f) \oplus (g \oplus h) \\ out[3] &= a \cdot b \cdot c \cdot d \cdot e \cdot f \cdot g \cdot h \end{aligned} \qquad (13)$$

(The expressions for out[1] and out[2] in Equation 13 are functions of the inputs a to h that are not legible in this copy.)
Figure 2:  Architecture for CMC Encoder.
Figure 3: 1’s counters (a) Row counter (b) Column 
Counter.
 
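$$\begin{aligned} out[0] &= a \oplus b \oplus c \oplus d \\ out[2] &= a \cdot b \cdot c \cdot d \end{aligned} \qquad (14)$$

(The expression for out[1] in Equation 14 is a function of the inputs a to d that is not legible in this copy.)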
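A 4-bit ones counter with the stated gate budget (4 half adders and one XOR gate) can be modelled in a few lines. The arrangement below is one possible structure that fits that budget; it is an assumption of this sketch and is not taken from Figure 3(b). The names `half_adder` and `column_counter` are hypothetical. In this arrangement out[0] is the parity of the four inputs and out[2] is their AND.

```python
from itertools import product
from typing import Tuple


def half_adder(x: int, y: int) -> Tuple[int, int]:
    """Return (sum, carry) of two 1-bit inputs."""
    return x ^ y, x & y


def column_counter(a: int, b: int, c: int, d: int) -> Tuple[int, int, int]:
    """Count the ones among four bits; returns (out2, out1, out0)."""
    s1, c1 = half_adder(a, b)       # HA 1: first pair
    s2, c2 = half_adder(c, d)       # HA 2: second pair
    out0, c3 = half_adder(s1, s2)   # HA 3: least significant bit and its carry
    s4, out2 = half_adder(c1, c2)   # HA 4: combine the pair carries
    out1 = s4 ^ c3                  # XOR gate: s4 and c3 are never both 1
    return out2, out1, out0


if __name__ == "__main__":
    # Exhaustive check against a direct count over all 16 input patterns.
    for bits in product((0, 1), repeat=4):
        out2, out1, out0 = column_counter(*bits)
        assert 4 * out2 + 2 * out1 + out0 == sum(bits)
    print("column_counter matches the ones count for all 16 inputs")
```

The final XOR is sufficient because c3 is 1 only when each pair holds exactly one 1, in which case both pair carries, and therefore s4, are 0.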
Finally, the corrector corrects the erroneous bits based on the horizontal syndrome, the vertical syndrome and the erroneous bits.
The following example illustrates the computation of the horizontal and vertical parity bits for MBU detection and correction in a group of words. Consider 128 original information bits (B), divided into four rows of 32 bits each. Each row is divided into 8 symbols of four bits each. HCC and VCC are the horizontal ones counter (row counter) and vertical ones counter (column counter) codes, used for predicting soft errors and reducing the number of redundant bits. The HPC and VPC bits detect and correct the errors in the 128 bits. For example, the original 128-bit information shown in Table 3 (a) may suffer intra-word errors as shown in Table 3 (b) or inter-word errors as shown in Table 3 (c). The horizontal counter codes were calculated using Equations (1)-(5) and the vertical counter codes using Equations (6)-(10). The horizontal and vertical parity bits were calculated using Equation (11) and Equation (12), respectively. Finally, both intra-word and inter-word MBUs can be corrected by the decoding algorithm.
3 Correction coverage and overhead 
analysis
The proposed CMC has been coded in the Verilog hardware description language (HDL), simulated using Xilinx ISim and functionally tested for various inputs. The correction coverage and the overheads have then been analyzed. For a fair comparison, Hamming [8], [9], MC [11], DMC [12], SHMC [14], RMC [14], I3D [22] and Xilinx CRC [19], [20] are used as references.
 
Figure 4: Architecture for CMC Decoder.
3.1 MBU Patterns
In 2009, E. Ibe et al. analyzed the scaling effects on neutron-induced soft errors in SRAM arrays down to the 22 nm technology node and observed that nearly 50% of soft errors are MBU incidents [21]. To fairly evaluate the MBU correction coverage of the proposed CMC technique, detailed information about the possible MBU error patterns of a 28 nm SRAM array and their individual occurrence probabilities is needed. Figure 5 shows the MBU patterns and their occurrence probabilities [22], [23].
3.2 Comparison for correction coverage
To assess the benefits and drawbacks of the proposed scheme, it is compared extensively with previous techniques. A simulation-based MBU injection experiment was carried out to extract the error correction coverage of the previous techniques. The original 128-bit information and the faulty information are specified in the test fixture, and fault injection is implemented in a test bench. Both single and multiple bit faults were injected; for MBU injection, around one million combinations were injected. The correction coverage of MBU mitigation techniques such as CMC, DMC, MC and Hamming was obtained for various intra-word error test cases and is shown in Figure 6. DMC achieves 100% intra-word error correction for up to 5-bit errors and 11.8% correction for 16-bit errors. Similarly, MC achieves 100% intra-word error correction for up to 2-bit errors and 0.6% correction for up to 8-bit errors. The proposed CMC, in contrast, provides 100% correction for errors of up to 32 bits. In addition, Table 5 compares the correction coverage of the proposed technique with that of proven soft error mitigation techniques and existing research techniques.
The correction coverage was also tested for larger word widths, which yield higher correction capabilities. The maximum correction capability (MCC) is given in Table 4. In DMC, the correction capability for 64-bit and 128-bit words is up to 9 bits and 17 bits, respectively. In the proposed CMC, the correction capability for 64-bit and 128-bit words is up to 36 bits and 44 bits, respectively. The results in Table 4 show that the proposed CMC outperforms the other codes through its error tolerance against larger MBU widths.

Table 3: (a) 128-bit logical organization of CMC

| S.No | Symbol8 | Symbol7 | Symbol6 | Symbol5 | Symbol4 | Symbol3 | Symbol2 | Symbol1 | HCC | HPC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 8080 | 0000 |
| 2 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0808 | 0000 |
| 3 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 8008 | 0000 |
| 4 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0880 | 0000 |
| VCC | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | | |
| VPC | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | | |

(b) Intra-word error version

| S.No | Symbol8 | Symbol7 | Symbol6 | Symbol5 | Symbol4 | Symbol3 | Symbol2 | Symbol1 | H'CC | H'PC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0808 | 0000 |
| 2 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0808 | 0000 |
| 3 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 8008 | 0000 |
| 4 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0880 | 0000 |
| V'CC | 1313 | 1313 | 1313 | 1313 | 1313 | 1313 | 1313 | 1313 | | |
| V'PC | 1111 | 1111 | 1111 | 1111 | 1111 | 1111 | 1111 | 1111 | | |

(c) Inter-word error version

| S.No | Symbol8 | Symbol7 | Symbol6 | Symbol5 | Symbol4 | Symbol3 | Symbol2 | Symbol1 | H'CC | H'PC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0101 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 1010 | 7171 | 1111 |
| 2 | 1010 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 0101 | 1717 | 1111 |
| 3 | 0110 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 7117 | 1111 |
| 4 | 1001 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 0110 | 1771 | 1111 |
| V'CC | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | 2222 | | |
| V'PC | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | | |
Table 4: Maximum Correction Capability (MCC)

| Technique | MCC (64 bits) | MCC (128 bits) |
|---|---|---|
| CMC | 36 bits | 44 bits |
| RMC | 16 bits | 32 bits |
| DMC | 9 bits | 17 bits |
| MC | 4 bits | 8 bits |
 
    
    
    
Figure 5: MBU patterns of high occurrence probabilities in 28 nm SRAM array [22]-[23]
Figure 6: Intra-word correction coverage for various ECCs
Figure 7: Required number of redundant bits for various error correction codes
3.3 Comparison for overhead analysis
To evaluate the efficiency of error mitigation techniques, the implementation overheads of the protection codes have to be analyzed. This paper analyzes the overheads in terms of cost and correction coverage per cost (CCC). The term cost denotes the number of redundant bits required to implement the error correction code [11]. The cost of the proposed and typical coding techniques is shown for 32, 64 and 128 bits in Figure 7. Hamming code needs very few redundant bits, but its correction capability is limited to 1 bit. DMC needs more redundant bits than all other codes. Figure 7 also shows that the number of redundant bits of the traditional codes increases linearly with the word length. The proposed CMC needs fewer redundant bits than all other codes because of its inter-word processing capability. The CCC results of the proposed and typical coding techniques for up to 32 bits in a word are shown in Figure 8. A coding technique should have a high CCC value to be a highly reliable solution. Note that when the number of errors exceeds one per word, Hamming code cannot correct any errors. The proposed CMC provides consistent performance compared with all typical coding techniques. Thus, based on the analysis in Figure 7 and Figure 8, the proposed CMC technique is better suited to low-cost and safety-critical (high-performance) applications.
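As a rough illustration of the CCC metric, the ratio of correction coverage to cost can be computed for two codes whose redundancy is stated in this paper. The sketch below assumes that CCC is simply coverage (in percent) divided by the number of redundant bits, uses the overall coverage values from the comparison table of soft error mitigation techniques, and charges CMC one quarter of its 48-bit group cost per word. These choices are assumptions of the example, so the numbers are not the values plotted in Figure 8.

```python
def ccc(coverage_percent: float, redundant_bits: float) -> float:
    """Correction coverage per cost: coverage (%) per redundant bit (assumed definition)."""
    return coverage_percent / redundant_bits


# Redundancy per 32-bit word, taken from the text.
DMC_BITS_PER_WORD = 36        # 36 redundant bits per 32-bit word
CMC_BITS_PER_WORD = 48 / 4    # 48 redundant bits per group of 4 words (Table 2)

# Overall correction coverage (%) from the comparison table.
DMC_COVERAGE = 95.823
CMC_COVERAGE = 100.0

if __name__ == "__main__":
    print(f"DMC: {ccc(DMC_COVERAGE, DMC_BITS_PER_WORD):.2f} % per redundant bit")
    print(f"CMC: {ccc(CMC_COVERAGE, CMC_BITS_PER_WORD):.2f} % per redundant bit")
```

Under these assumptions CMC obtains the higher ratio, because its redundancy is shared by the four words of a group.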
The best metric for selecting an appropriate coding technique for practical solutions is the mean time to repair (MTTR), which is analyzed for all soft error mitigation techniques in Table 5. MTTR-R denotes the additional recovery time required on top of the MTTR. The results in Table 5 show that the proven mitigation techniques, Xilinx SEU correction [19] and Xilinx CRC+ECC [20], need an MTTR of only 9.342 ms, but their correction coverage for the recent scaled technology (28 nm) is not satisfactory. The Xilinx CRC+Reload technique [20] gives 100% correction coverage, but its total repair time (9.342 ms plus 18.7 ms of recovery) is almost three times that of the other techniques, and this overhead is not acceptable in real time. The coding techniques presented in [14] require the minimum MTTR because of the Encode-Compare mechanism, but their correction coverage is not the maximum. The DMC technique requires 9.6 ms to correct errors, and its correction coverage is only about 95.823% [12]. The recent I3D technique requires 9.343 ms to detect the error and 0.351 ms to recover the erroneous word, giving a total MTTR of 9.694 ms, and its correction coverage is only about 94.2% [22]. The proposed CMC requires only 9.387 ms to correct all the error patterns shown in Figure 5, and this MTTR is almost equal to that of the proven techniques. Thus, the proposed CMC technique is better suited to safety-critical applications than the typical coding techniques. Finally, the memory overhead for storing the redundant bits in Xilinx FPGA devices is shown in Figure 9. Hamming code needs the minimum memory overhead, but its correction capability is limited to 1 bit. The proposed CMC and the SHMC technique presented in [14] require an acceptable level of redundant memory overhead compared with all other codes.

Figure 8: Correction coverage per cost for various error correction codes
Figure 9: Intra-word memory area overhead analysis of various Xilinx FPGA devices.

Table 5: Comparison of different soft error mitigation techniques

| Soft error correction technique | MTTR (ms) | MTTR-R (ms) | Correction coverage (%) | Distinguishing note |
|---|---|---|---|---|
| Proven mitigation techniques | | | | |
| Xilinx SEU Correction [19] | 9.342 | 0 | 51.72 | Single bit correction |
| Xilinx CRC+ECC [20] | 9.342 | 0 | 61.1 | Global detection & single bit correction |
| Xilinx CRC+Reload [20] | 9.342 | 18.7 | 100 | External storage required |
| Existing research techniques | | | | |
| Hamming code [8], [9] | 10.7 | 0 | 51.652 | Decode-Compare |
| DMC [12] | 9.6 | 0 | 95.823 | Decode-Compare |
| MC [11] | 11.2 | 0 | 93.81 | Decode-Compare |
| SHMC [14] | 6.57 | 0 | 95.913 | Encode-Compare |
| RMC [14] | 6.68 | 0 | 94.62 | Encode-Compare |
| I3D [22] | 9.343 | 0.351 | 94.2 | Erasure code |
| Proposed technique | | | | |
| CMC [Pro] | 9.387 | 0 | 100 | Prediction & Encode-Compare |
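The repair-time comparison can be reproduced directly from the tabulated MTTR and MTTR-R values. The short sketch below treats the total repair time as MTTR plus MTTR-R, as is done for I3D in the text; the dictionary `TECHNIQUES` and the helper `total_repair_time` are names introduced only for this example.

```python
# (MTTR in ms, additional recovery time MTTR-R in ms), from the comparison table.
TECHNIQUES = {
    "Xilinx CRC+Reload": (9.342, 18.7),
    "DMC": (9.6, 0.0),
    "MC": (11.2, 0.0),
    "I3D": (9.343, 0.351),
    "CMC": (9.387, 0.0),
}


def total_repair_time(name: str) -> float:
    """Total repair time in ms: MTTR plus additional recovery time."""
    mttr, recovery = TECHNIQUES[name]
    return mttr + recovery


if __name__ == "__main__":
    cmc = total_repair_time("CMC")
    for name in ("I3D", "DMC", "MC"):
        saving = total_repair_time(name) - cmc
        print(f"CMC is {saving:.3f} ms faster than {name}")
    ratio = total_repair_time("Xilinx CRC+Reload") / cmc
    print(f"Xilinx CRC+Reload takes {ratio:.2f} times as long as CMC")
```

The differences of about 0.3 ms, 0.2 ms and 1.8 ms and the ratio of roughly 3 agree with the figures quoted in the abstract.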
4 Conclusion
In this paper, a novel technique, CMC, is proposed to cope with radiation-induced MBUs. The results show that the proposed scheme offers a better level of protection against large MBUs within and across memory words. The proposed CMC uses the Encode-Compare mechanism to predict and correct errors for a group of words, so that the MTTR is minimal and equivalent to that of proven mitigation techniques, with improved correction coverage. The only drawback of the proposed work is that it requires more redundant bits to protect the memory. Future research will aim at improving the reliability and reducing the cost of the proposed technique for FPGAs below 28 nm.
 
5 Acknowledgements
This work was supported in part by the University Grants Commission (UGC), Government of India, National Fellowship under Grant NFO25109.
6 References
1. C. Argyrides, C. Lisboa, L. Carro and D. K. Pradhan, "A soft error robust and power aware memory design," in Proc. 20th Annu. Symp. Integr. Circuits Syst. Des. (SBCCI), Sep. 2007, pp. 300–305. www.inf.ufrgs.br/~calisboa/.../SlidesSBCCI2007ETLPRAM.pdf
2. M. J. Wirthlin, "FPGAs Operating In A Radiation Environment: Lessons Learned From FPGA In Space," Workshop on Electronics for Particle Physics, Oxford, U.K., September 2012, pp. 17–21. https://indico.cern.ch/event/.../twepp_wirthlin_Sept_2012.ppt.pdf
3. Xilinx, "Device Reliability Report, UG116, v10.1," August 2014. juhuj.com/open-file-pdf-convert-pdf-download-ug116.htm
4. D. Radaelli, H. Puchner, S. Wong, and S. Daniel, "Investigation of multi-bit upsets in a 150 nm technology SRAM device," IEEE Trans. Nucl. Sci., vol. 52, no. 6, pp. 2433–2437, Dec. 2005. ieeexplore.ieee.org/document/1589220/
5. R. C. Baumann, "Radiation-induced soft errors in advanced semiconductor technologies," IEEE Trans. Device Mater. Rel., vol. 5, no. 3, pp. 305–316, Sep. 2005. ieeexplore.ieee.org/document/1545891/
6. ITRS 2002. [Online]. Available: http://public.itrs.net
7. P. M. B. Rao, M. Ebrahimi, R. Seyyedi, and M. B. Tahoori, "Protecting SRAM-based FPGAs against multiple bit upsets using erasure codes," in Proc. 51st ACM/EDAC/IEEE Design Autom. Conf. (DAC), Jun. 2014, pp. 1–6. http://ieeexplore.ieee.org/document/6881539/?reload=true&arnumber=6881539
8. A. Sanchez-Macian, P. Reviriego, J. A. Maestro, "Hamming SEC-DAED and Extended Hamming SEC-DED-TAED Codes Through Selective Shortening and Bit Placement," IEEE Trans. Device Mater. Rel., vol. 14, no. 1, pp. 574–576, Mar. 2014. http://ieeexplore.ieee.org/document/6217302/
9. D. Houghton, "The Engineer's Error Coding Handbook," Chapman and Hall, London, U.K., 1997. www.springer.com/gp/book/9780412790706
10. P. Reviriego, M. Flanagan, and J. A. Maestro, "A (64,45) triple error correction code for memory applications," IEEE Trans. Device Mater. Rel., vol. 12, no. 1, pp. 101–106, Mar. 2012. ieeexplore.ieee.org/document/6026914/
11. C. Argyrides, D. K. Pradhan, and T. Kocak, "Matrix codes for reliable and cost efficient memory chips," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 3, pp. 420–428, Mar. 2011. ieeexplore.ieee.org/document/5352255/
12. Jing Guo, Liyi Xiao, Zhigang Mao, Qiang Zhao, "Enhanced Memory Reliability Against Multiple Cell Upsets Using Decimal Matrix Code," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 1, pp. 127–135, Jan. 2014. http://ieeexplore.ieee.org/document/6487418/
13. R. Naseer and J. Draper, "Parallel double error correcting code design to mitigate multi-bit upsets in SRAMs," in Proc. 34th Eur. Solid-State Circuits, Sep. 2008, pp. 222–225. www.isi.edu/~draper/papers/esscirc08.pdf
14. A. Ahilan, P. Deepa, "Design for Built-In FPGA Reliability via Fine-Grained 2-D Error Correction Codes," Microelectronics Reliability, vol. 55, pp. 2108–2112, Aug.–Sep. 2015. http://www.sciencedirect.com/science/article/pii/S0026271415001675?np=y
15. G. Neuberger, D. L. Kastensmidt, and R. Reis, "An automatic technique for optimizing Reed-Solomon codes to improve fault tolerance in memories," IEEE Design Test Comput., vol. 22, no. 1, pp. 50–58, Jan.–Feb. 2005. https://www.lume.ufrgs.br/bitstream/handle/10183/27598/000459042.pdf?sequence=1
16. S. Liu, P. Reviriego, and J. A. Maestro, "Efficient majority logic fault detection with difference-set codes for memory applications," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 1, pp. 148–156, Jan. 2012. http://iosrjournals.org/iosr-jece/papers/Vol8-Issue2/M0827178.pdf
17. A. Appathurai, P. Deepa, "Design for reliability: A novel counter matrix code for FPGA based quality applications," in Proc. 6th Asia Symposium on Quality Electronic Design (ASQED), Aug. 2015, pp. 56–61. http://ieeexplore.ieee.org/document/7274007/
18. L. Jones, "Single event upset (SEU) detection and correction using Virtex-4 devices," Xilinx Corporation, San Jose, CA, USA, Appl. Note XAPP714, 2007. http://www.eng.auburn.edu/~strouce/class/bist/CATA09seu.pdf
19. Xilinx, "LogiCORE IP soft error mitigation controller, PG036, v3.4," San Jose, CA, USA, 2012. www.xilinx.com/support/documentation/ip.../v3_4/pg036_sem.pdf
20. E. Ibe, H. Taniguchi, Y. Yahagi, K. Shimbo, and T. Toba, "Impact of scaling on neutron induced soft error in SRAMs from an 250 nm to a 22 nm design rule," IEEE Trans. Electron Devices, vol. 57, no. 7, pp. 1527–1538, Jul. 2010. http://ieeexplore.ieee.org/document/5467170/
21. E. Costenaro, D. Alexandrescu, K. Belhaddad, and M. Nicolaidis, "A practical approach to single event transient analysis for highly complex design," J. Electron. Test., vol. 29, no. 3, pp. 301–315, 2013. http://ieeexplore.ieee.org/document/6104439/
22. M. Ebrahimi, P. M. B. Rao, R. Seyyedi, M. B. Tahoori, "Low-Cost Multiple Bit Upset Correction in SRAM-Based FPGA Configuration Frames," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 1, pp. 148–156, Jan. 2012. http://ieeexplore.ieee.org/document/7104165/
23. JEDEC89C Standard, [Online]. Available: http://www.jedec.org/standards-documents, accessed Apr. 2015.
Arrived: 26. 09. 2016
Accepted: 13. 12. 2016