Design of Reliable Memories with Self-Error-Correction by Group Testing Based Non-Binary Codes 1

Lake Bu, Student Member, IEEE, Mark Karpovsky, Life Fellow, IEEE
Reliable Computing Laboratory, Electrical and Computer Engineering, Boston University, Boston, USA

Abstract—In this paper we propose a new way of designing reliable memories with a class of group testing based error correcting codes. Instead of conventional single and double-bit error correction, these codes provide much stronger reliability against multi-byte errors. To correct m errors in Q-ary codewords of length N, the hardware overhead is just O(mN log Q), and the latency is almost negligible. Together with theoretical proofs, this paper also provides a detailed analysis of the hardware architecture of reliable memories based on these codes. Both explain why the new design is cost-effective. Compared to the hardware decoders of the most popular byte-error correcting codes, which require finite field multipliers and dividers in GF(Q), the proposed approach involves only simple binary operations, leading to considerable savings in time and hardware cost. We also introduce the new concept of 1-step threshold decoding, which not only simplifies the decoder's hardware but also generalizes the relationship between the error correcting probability and the size of the redundancy. The new technique proposed in this paper can be used as a replacement for current error correcting codes in the design of low-cost and high-reliability memories, such as cache, RAM, and Flash memories.

Index terms — reliable design, memory, redundancy, fault tolerance, error correction, GTB codes, group testing, superimposed codes, threshold decoding.

I. INTRODUCTION

The VLSI industry has grown exponentially in recent years. Semiconductor products now achieve much higher integration and speed through the increasing density of on-chip transistors and rising clock frequencies. Their sizes keep shrinking and their performance keeps improving. However, the growth of integration and speed in electronic components also causes instability and increases the probability of errors. It is therefore extremely important to provide these systems with higher reliability and faster error correction at relatively low overhead.

To provide electronic systems with reliability at the data level, error control codes (ECC) are commonly used; they are capable of dealing with random soft errors [1]. For memories, bit-error correcting codes are mostly adopted, among which single and double-bit ECCs are the majority. However, nowadays most memories are byte-organized or word-organized, meaning each byte contains b bits and a single error event can affect the whole byte. Moreover, large-capacity and high-speed memories, due to their high-density nature, are strongly vulnerable to particle strikes [2], so it is not rare that even multiple b-bit bytes are distorted [3], [4]. This indicates that bit-level error correction may be insufficient due to the increasing probability of multi-byte errors in memories.

1 This work was sponsored by the NSF grant CNS 1012910.

It is also essential that the proposed design have relatively small redundancy, hardware overhead, and latency [5].

Generally speaking, the decoding complexity of non-binary Q-ary ECCs is determined by three factors: the error correcting capability m (the number of defective bytes to be located and corrected), the size of the finite field GF(Q), where Q = 2^b, and N, the number of digits (bytes) in each codeword.

For byte-error correction, Reed-Solomon (RS) codes and non-binary Hamming codes are known for having the smallest code redundancy (minimal number of redundant digits). However, their decoding complexity is relatively high: they require multiplications and divisions over finite fields, so the hardware cost of their decoders grows proportionally to at least b^2 as the byte size b = log_2 Q grows (all logs in this paper are base 2 unless specified).

Another drawback of conventional non-binary ECCs is their decoding latency. For example, the decoding of Reed-Solomon (RS) codes based on the Berlekamp-Massey algorithm and Chien search [6], [7] requires 2m^2 + 9m + 3 + N clock cycles for m-error locating and correction in an N-byte codeword [8]. Even single-byte ECCs such as non-binary Hamming codes require considerable time for the finite field multiplications and divisions needed to correct errors. As the memories of today's computers and smart devices all work at high speed, such latency is hardly acceptable.

In response to the disadvantages of the conventional byte-error correcting codes, interleaved codes were invented as an alternative. For single-byte error correction, interleaved Hamming codes are used [9]; for multi-byte errors, interleaved Orthogonal Latin Square Codes (OLSCs) [10]. These codes interleave the original b-bit bytes of each codeword and perform binary error correction b times before all the bits are de-interleaved to restore the original bytes. This technique is high-speed and avoids complicated multiplications and divisions over GF(Q), but at the cost of b decoders, which results in a high decoding complexity.

It is obvious that the design of reliable memories is a trade-off between the number of redundant digits in the corresponding code and the complexity of decoding, which is reflected in hardware and time overhead.

Based on the above criteria, in this paper we propose a new class of group testing based Q-ary (Q = 2^b) error correcting codes: GTB codes. The check matrices of GTB codes are generated from binary superimposed codes, which enables low-complexity decoding with mere binary operations. With this proposed class of codes, the hardware overhead is as low as O(mNb). In contrast, popular codes such as Reed-Solomon codes operate with a decoding complexity proportional to at least b^2. The GTB codes' decoding latency is negligible thanks to a fully combinational network, while non-binary Hamming and RS codes take much longer, as mentioned above. The GTB codes require more redundancy than non-binary Hamming and RS codes, which is expected; nevertheless, their redundancy is still much smaller than that of the interleaved codes. And GTB codes need only one decoder for error correction over b-bit digits, while interleaved codes need b of them. These advantages make GTB codes a promising low-cost and high-reliability ECC for the design of reliable memories.

The rest of the paper is organized as follows. Sections II to VII present the mathematical concepts and theorems describing the GTB codes. In Section II, the definition and construction of the GTB codes' check matrices are introduced. In Section III we define the GTB codes and discuss their optimal parameters. In Section IV the encoding algorithm is explained. In Section V, we develop the error locating and correcting algorithms. In Section VI, the new concept of 1-step threshold decoding is introduced to simplify and generalize the decoding procedure of GTB codes. In Section VII, single and double-byte error correction are discussed as the two most important cases from the practical point of view.

In Sections VIII and IX, the GTB codes are compared in many aspects with other classical codes, such as non-binary Hamming, Reed-Solomon, and interleaved codes. Section VIII focuses on code rates. Section IX is devoted to a comparison of the error detection and correction probabilities of GTB codes and other popular ECCs.

Section X presents the hardware implementation of the GTB codes' decoder, along with experimental results. This section shows that the hardware cost of the decoder is linear in the codeword size and the error correcting capability. In addition, the decoder's combinational network has little latency.

Section XI summarizes the properties and advantages of the proposed memory architecture based on the GTB codes. It is also suggested that GTB codes can be used as a replacement for current ECCs in the design of low-cost and high-reliability memories.

II. THE CHECK MATRICES OF GTB CODES

As we will see below, the check matrices of Q-ary GTB codes have only zeros and ones as their elements. The non-binary codes with multi-byte error correcting capability provide high reliability, while the binary check matrices ensure low decoding complexity. The only cost associated with GTB codes is a larger number of redundant digits compared to Hamming and RS codes, which are known for the best code rates.

A. The Definition of Superimposed Codes

Superimposed codes will be used for the construction of the binary check matrices of GTB codes.

Definition 2.1: Let M_{i,j} ∈ {0, 1} be the element in row i and column j of a binary matrix M of size A × N. The set of columns of M is called an m-superimposed code if, for any set T of up to m columns and any single column h ∉ T, there exists a row k of M for which M_{k,h} = 1 and M_{k,j} = 0 for all j ∈ T [11], [12].

The above property is called zero-false-drop of order m.

Definition 2.2: For any two A-bit binary vectors u and v, we say that u covers v if u · v = v, where · denotes the bitwise AND of the two vectors. An A × N matrix M is m-disjunct if the bitwise OR of any set of no more than m columns does not cover any single column that is not in the set. The columns of an m-disjunct matrix compose an m-superimposed code [13].

For example, the columns of the following matrix form a 1-superimposed code (the matrix is 1-disjunct):

M =
| 1 0 0 0 |
| 0 0 1 0 |
| 1 1 1 1 |
| 0 0 0 1 |
| 0 1 0 0 |

It follows that superimposed codes are uniquely decodable of order m, as defined below.

Definition 2.3: For all the N columns of M, the Boolean ORs of up to any m columns are all different.

The property above is also called m-separability. It has been proved that a matrix M constructed from superimposed codes is not only m-separable but also m-disjunct [13].

Because of their zero-false-drop and unique-decodability properties, superimposed codes are often used in non-adaptive group testing. For an (A × N) m-superimposed code matrix, A is the number of tests needed to find up to m defective items, and N the number of items under test. The rows of M are the test patterns, and the columns of M indicate the tests in which each item is involved.

Since the test syndrome of m defective items over the A tests is essentially the bitwise OR of the corresponding m columns of M, the unique A-bit syndrome makes it possible to locate the m defective items.
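To make this decoding rule concrete, the following minimal Python sketch (ours, not from the original) locates defective items from the OR syndrome of the 1-disjunct matrix shown above. For an m-disjunct matrix and at most m defectives, declaring an item defective exactly when all of its tests are positive is exact.

    # A minimal sketch of non-adaptive group testing decoding.
    # Rows of M are tests; columns are items; M is the 1-disjunct
    # example matrix above.
    M = [
        [1, 0, 0, 0],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [0, 1, 0, 0],
    ]

    def locate_defectives(M, syndrome):
        # Item j is defective iff every test containing it is positive.
        A, N = len(M), len(M[0])
        return [j for j in range(N)
                if all(syndrome[i] for i in range(A) if M[i][j] == 1)]

    # Example: item 2 is defective; the syndrome is the OR of column 2.
    syndrome = [M[i][2] for i in range(len(M))]
    print(locate_defectives(M, syndrome))  # -> [2]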

B. Notations

Before describing the construction of superimposed codes, which will be used for the construction of check matrices for GTB codes, we introduce the following notations:

◦ nq: the total number of digits in codewords of a q-ary (nq, kq, dq)q code Cq;
◦ kq: the number of information digits in Cq;
◦ rq = nq − kq: the number of redundant digits in Cq;
◦ dq: the Hamming distance between codewords of Cq;
◦ A: the length of codewords in a superimposed code C_SI;
◦ N = |C_SI|: the number of codewords in C_SI;
◦ M: the binary matrix of size A × N whose columns are the codewords of C_SI;
◦ d_SI: the distance between codewords of C_SI;
◦ m: the number of errors to be corrected by GTB codes;
◦ l: the maximum Hamming weight of the rows of M.

C. Construction of Superimposed Codes

Construction 2.1: Let Cq be an (nq, kq, dq)q q-ary (q = p^s a power of a prime, p ≠ 2) conventional error correcting code. Each digit of Cq in GF(q) is represented by a q-bit binary vector of Hamming weight one. A superimposed code C_SI can be constructed by substituting every q-ary digit of the codewords of Cq by its corresponding binary vector. The resulting m-superimposed code C_SI has the following parameters [15]:

A = q·nq;
N = q^{kq};
l = q^{kq − 1};   (1)
d_SI = 2dq;
m = ⌊(nq − 1)/(nq − dq)⌋.

If Cq is a maximum-distance separable (MDS) q-ary code, for which dq = rq + 1, such as an RS code, then m can be written as [13]:

m = ⌊(nq − 1)/(kq − 1)⌋.   (2)

The codewords of the m-superimposed code C_SI form the columns of an A × N matrix M. Every row contains l ones and every column contains exactly nq ones.

Example 2.1: An extended ternary Reed-Solomon code has parameters (nq, kq, dq)q = (3, 2, 2)_3. Its codewords are:

Cq = {(0, 0, 0), (0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 1, 1), (1, 2, 0), (2, 0, 1), (2, 1, 0), (2, 2, 2)}.

Suppose 0, 1, 2 are represented by the 3-bit binary vectors (100), (010), (001) respectively. Then the 2-superimposed code C_SI consists of N = 9 codewords, listed as the columns of the 9 × 9 matrix M below. Here N = 9, A = 9, d_SI = 4, and according to (1), m = 2, so C_SI is a 2-superimposed code.

M =
| 1 1 1 0 0 0 0 0 0 |
| 0 0 0 1 1 1 0 0 0 |
| 0 0 0 0 0 0 1 1 1 |
| 1 0 0 1 0 0 1 0 0 |
| 0 1 0 0 1 0 0 1 0 |
| 0 0 1 0 0 1 0 0 1 |
| 1 0 0 0 0 1 0 1 0 |
| 0 0 1 0 1 0 1 0 0 |
| 0 1 0 1 0 0 0 0 1 |

The A generated by Construction 2.1 is usually not minimal with respect to the known lower bounds on the length of superimposed codes [14]. Other constructions can achieve smaller A (see e.g. [11], [12]) but result in a larger decoding complexity for GTB codes.
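The substitution step of Construction 2.1 is mechanical, as the following Python sketch (ours, using the (3, 2, 2)_3 codewords of Example 2.1) illustrates; it reproduces the 9 × 9 matrix M above.

    # A sketch of Construction 2.1: each q-ary digit becomes a
    # weight-one q-bit segment of the corresponding binary column.
    q = 3
    Cq = [(0, 0, 0), (0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 1, 1),
          (1, 2, 0), (2, 0, 1), (2, 1, 0), (2, 2, 2)]

    def superimposed_matrix(Cq, q):
        A, N = q * len(Cq[0]), len(Cq)      # A = q*nq, N = q^kq
        M = [[0] * N for _ in range(A)]
        for j, word in enumerate(Cq):       # column j <- codeword j
            for t, digit in enumerate(word):
                M[t * q + digit][j] = 1     # digit d -> a 1 in row t*q + d
        return M

    for row in superimposed_matrix(Cq, q):
        print(row)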

As we will see in Section V, with the columns of M being codewords of superimposed codes, Construction 2.1 of the check matrices makes low-complexity decoding of GTB codes possible. With these low-density binary matrices, only a small number of binary operations is needed for multi-byte error correction. This makes GTB codes a low-cost and high-speed ECC, in contrast to most popular non-binary ECCs requiring extensive finite field multiplications and divisions. This is explained and proved in detail in Sections IV and V, which discuss the encoding and decoding algorithms of GTB codes.

III. GROUP TESTING BASED (GTB) CODES

In the previous section the check matrices M of GTB codes, whose columns are codewords of superimposed codes, have been defined. The new non-binary linear Group Testing Based (GTB) Q-ary (Q = 2^b) codes are based on these binary check matrices.

A. Notations

In addition to the notations introduced in Section II-C, the following notations will be used for the definitions, theorems, and proofs below:

◦ M_{i,∗}: the ith row of M;
◦ M_{∗,j}: the jth column of M;
◦ N: the length of an (N,K,D)Q GTB codeword v;
◦ K: the number of information digits in a GTB codeword;
◦ R: the number of redundant digits in a GTB codeword;
◦ b: the number of bits in a digit (byte) of a Q-ary GTB codeword over GF(Q): b = log_2 Q;
◦ λ: the maximal number of ones in common between any two columns of M: |M_{∗,i} · M_{∗,j}| ≤ λ for i ≠ j, i, j ∈ {1, 2, ..., N}, where · is the bitwise AND [16];
◦ ⊕: the bitwise XOR operator;
◦ Blocks: M can be partitioned into nq sub-matrices of exactly q rows each. Each sub-matrix is called a block B_t, a set of q rows with B_t = {M_{i,∗} | ⌈i/q⌉ = t}, t ∈ {1, 2, ..., nq}. Each row of a block has exactly l ones and, within a block, each column has exactly one 1.

B. Definition of GTB Codes

Definition 3.1: Let M be an A × N binary matrix whose columns are all the codewords of an m-superimposed code. V is called a GTB code if V = {v | M · v = 0}, v ∈ GF(Q)^N.

Remark 3.1: Since the check matrix M of a Q-ary GTB code is binary, even for Q-ary GTB codes (Q = 2^b) the syndrome computation is as simple as additions in GF(2^b), namely bitwise XORs (denoted by ⊕). No multiplications or inversions in the finite field GF(2^b) are involved in the decoding procedures. Moreover, the number of XORs performed in M · v is determined by the number of ones in M, which is always a low-density matrix; for the optimal construction of GTB codes described in the next sub-section, the fraction of ones in the check matrix M is N^{−0.5}. These properties make large savings in hardware and time cost possible in the design of reliable memories with GTB codes.

C. The Optimal Construction of GTB Codes

We say that a GTB code is optimal if it results in a minimal complexity of decoding. As discussed in Remark 3.1, the hardware cost of decoding is determined by the number of ones in M. There are A rows in M and each row has l ones, making A · l ones in total in the check matrix; this is the number of bitwise XORs required to compute the syndrome.

Definition 3.2: Given N, the length of codewords, and m, the maximal number of errors to be corrected, the optimal GTB code provides the minimal A · l. This guideline yields the smallest hardware overhead in the design of reliable memories using GTB codes.

Construction 2.1 shows that M can be constructed from conventional q-ary error correcting codes. One of the most popular multi-byte ECCs is the BCH code, of which the widely used RS code is a special case. Thus we start from the construction based on BCH codes.

Given N and m, by the properties of BCH codes [17] and (1), the parameters of the corresponding BCH codes and the hardware complexity of computing the syndromes are:

nq = kq + rq; kq = log_q N; rq = (dq − 2)i + 1;
m = ⌊(nq − 1)/(nq − dq)⌋; l = q^{kq − 1};
A = q·nq = (kq + (dq − 2)i + 1)q;
A · l = (kq + (dq − 2)i + 1)q^{kq}.   (3)

Reed-Solomon is the special case of BCH codes with i = 1. It is also an MDS code, for which dq = rq + 1. Therefore for Reed-Solomon codes the above equations can be rewritten as:

nq = kq + rq; kq = log_q N; rq = dq − 1;
m = ⌊(nq − 1)/(kq − 1)⌋; l = q^{kq − 1};
A = q·nq = (kq + dq − 1)q;
A · l = (kq + dq − 1)q^{kq}.   (4)

It is obvious from (4) that, to correct the same number of errors in codewords of the same length, RS codes (i = 1) have the smallest A and A · l among all BCH codes (see Fig. 1). Therefore we use Reed-Solomon codes as the base codes to construct the check matrices of GTB codes.

Fig. 1: Decoding complexity (hardware cost in the number of equivalent 2-input gates) comparison between GTB codes constructed from RS codes and from BCH codes, under the same N and m.

Remark 3.2: Since Reed-Solomon codes are MDS codes, they have the maximal dq. Under the same nq, they generate a larger m than non-MDS codes by m = ⌊(nq − 1)/(nq − dq)⌋ [18].

The optimal (N,K,D)Q GTB code V has the parameters given by the following theorem.

Theorem 3.1: Given N, the length of codewords, and m, the number of errors to be corrected, the optimal Q-ary GTB code and its check matrix with minimal A · l can be constructed from the Reed-Solomon code with parameters:

(nq, kq, dq)q = (m + 1, 2, m)q.   (5)

Then N = q^2, A = q(m + 1), l = q, and the minimal syndrome computation complexity is:

A · l = (m + 1)q^2 = (m + 1)N.   (6)

Proof: According to Construction 2.1 and (1), we have:

A · l = q · nq · q^{kq − 1} = nq · q^{kq} = nq · N.

Since N is given, it comes down to finding the minimal nq. For RS codes we have:

m = ⌊(nq − 1)/(kq − 1)⌋.

When m is given, the problem comes down to finding the minimal kq, which is obviously kq = 2. Substituting into the other equations:

nq = m + 1; dq = m; N = q^2; l = q;
A · l = (m + 1)N. □

D. The Parameters of Optimal GTB Codes

With the optimal parameters presented above, it can be proved that GTB codes have the following properties.

Theorem 3.2: If a Q-ary GTB code V of length N is defined by V = {v | M · v = 0}, v ∈ GF(Q)^N, where M is generated by Construction 2.1 and Theorem 3.1, then V has the parameters (N,K,D)Q = (q^2, q^2 − q(m + 1) + m, 2m + 2)Q. It is able to detect up to 2m + 1 errors and correct up to m errors.

Proof: The redundancy R of a code equals the number of linearly independent rows of its check matrix. By Construction 2.1 and the definition of blocks, M has nq blocks and the sum of all rows of each block is always the all-one vector. Therefore we only need to remove one row from each of nq − 1 blocks to make the remaining rows linearly independent. Also, from (5), nq − 1 = m, so that:

R = A − (nq − 1) = q·nq − m = q(m + 1) − m;
K = N − R = q^2 − q(m + 1) + m.

If M is constructed from an m-superimposed code, every column contains nq = m + 1 ones. In Section III-A, λ was defined as the maximal number of ones in common between any two columns. From Construction 2.1:

λ = nq − dq.

Since for RS codes dq = rq + 1,

λ = nq − rq − 1 = kq − 1.

From Theorem 3.1 the optimal kq = 2, thus:

λ = 1.   (7)

For any two columns of M, since λ = 1, their bitwise XOR can turn at most one pair of ones in the same location of the two columns into a 0. We call this a cancellation. For x columns, the maximal number of cancellations is C(x, 2) = x(x − 1)/2.

On the other hand, since each of these x columns has m + 1 ones, the maximal number of ones over all x columns is x(m + 1). For each block there must be at least one cancellation to make the bitwise XOR equal to zero, so the sum can have at most x(m + 1) − (m + 1) ones. To make the number of cancellations greater than or equal to the number of ones in the sum of the x columns, we need:

C(x, 2) ≥ (m + 1)x − (m + 1);
⇒ x^2 − (2m + 3)x + 2(m + 1) ≥ 0.

Solving this quadratic inequality gives x ≥ 2m + 2.

This shows that for a matrix M constructed from an m-superimposed code with the parameters of (5), it takes at least 2m + 2 columns for their bitwise XOR to be the all-zero vector. Hence for GTB codes:

D = 2m + 2.

Therefore, given N and m, the parameters of the corresponding optimal GTB code are:

(N,K,D)Q = (q^2, q^2 − q(m + 1) + m, 2m + 2)Q. □

It is notable that the optimal parameters of an (N,K,D)Q m-error correcting GTB code do not depend on Q.

Corollary 3.1: The rate of GTB codes is:

K/N = 1 − (A − m)/N = 1 − ((m + 1)q − m)/q^2.

It is obvious that when N = q^2 → ∞ and m/q → 0, we have K/N → 1.

Corollary 3.2: If K and m are given, then the optimal q_opt minimizing A · l is [19]:

q_opt = ⌈( (m + 1) + √((m + 1)^2 + 4(K − m)) ) / 2⌉_{p^s},   (8)

where q_opt is the nearest power of a prime that is larger than the value inside ⌈ ⌉.

Proof: Since N = K + R = q^{kq}, R = A − m, and A = q · nq, substituting R and A into N gives:

q^{kq} − q · nq − K + m = 0.

From Theorem 3.1 the optimal kq and nq are:

kq = 2; nq = m + 1.

Solving the resulting quadratic equation in q gives the optimal q_opt of (8) when K and m are given. □

Remark 3.3: We note that Construction 2.1 also provides trade-offs between decoding complexity (A · l) and code rate. For a GTB code based on an (nq, kq, dq)q = (kq + m − 1, kq, m)q RS code with kq > 2, the GTB code has (N,K,D)Q = (q^{kq}, q^{kq} − (kq + m − 1)q + kq + m − 2, 2m + 2)Q. This GTB code's A · l is larger than the one given in Theorem 3.1, but its code rate is better.

IV. ENCODING

For encoding of GTB codes one transforms the check matrix M into row canonical form; the redundant digits are then encoded with bitwise XOR operations. As with decoding, GTB encoding requires no finite field computations.

Definition 4.1: A matrix M is said to be in row canonical form, or Reduced Row Echelon Form (RREF) [20], if the following conditions hold:

◦ All zero rows, if any, are at the bottom of the matrix;
◦ Each first nonzero entry in a row is to the right of the first nonzero entry in the preceding row;
◦ Each pivot (the first nonzero entry of a row) is equal to 1;
◦ Each pivot is the only nonzero entry in its column.

Corollary 4.1: If an A × N matrix M is a binary check matrix generated from an m-superimposed code for an (N,K,D)Q GTB code V, then it can be transformed to a row canonical form M′ with A − m non-zero rows. In M′ the A − m columns containing the pivots give the locations of the redundant digits, and the remaining N − A + m columns the information digits.

Example 4.1: A GTB code V over GF(Q), Q = 2^3, has length N = 9 digits and is able to correct single-digit errors. According to Theorem 3.1, its check matrix M can be constructed from the (nq, kq, dq)q = (2, 2, 1)_3 Reed-Solomon code. Then code V has the parameters (N,K,D)Q = (9, 4, 4)_{2^3}, and for this code:

M =
| 1 1 1 0 0 0 0 0 0 |
| 0 0 0 1 1 1 0 0 0 |
| 0 0 0 0 0 0 1 1 1 |
| 1 0 0 1 0 0 1 0 0 |
| 0 1 0 0 1 0 0 1 0 |
| 0 0 1 0 0 1 0 0 1 |

By Corollary 4.1, after transforming M into row canonical form we have:

M′ =
| 1 0 0 0 1 1 0 1 1 |
| 0 1 0 0 1 0 0 1 0 |
| 0 0 1 0 0 1 0 0 1 |
| 0 0 0 1 1 1 0 0 0 |
| 0 0 0 0 0 0 1 1 1 |
| 0 0 0 0 0 0 0 0 0 |

M′ indicates that in a codeword (message) v = (v1, v2, v3, v4, v5, v6, v7, v8, v9), the redundant digits are v1, v2, v3, v4, v7, and the information digits are v5, v6, v8, v9.

If v5 = (011), v6 = (101), v8 = (110), v9 = (111), then the codeword is encoded by M′ as:

v1 = v5 ⊕ v6 ⊕ v8 ⊕ v9 = (111);
v2 = v5 ⊕ v8 = (101); v3 = v6 ⊕ v9 = (010);
v4 = v5 ⊕ v6 = (110); v7 = v8 ⊕ v9 = (001);

and so v = (111, 101, 010, 110, 011, 101, 001, 110, 111).
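Since encoding reduces to the XOR relations read off M′, it can be checked with a few lines of Python (a sketch of Example 4.1, ours; digits are b-bit integers and GF(2^b) addition is bitwise XOR):

    # A sketch of the encoding of Example 4.1 over GF(2^3).
    v5, v6, v8, v9 = 0b011, 0b101, 0b110, 0b111   # information digits

    v1 = v5 ^ v6 ^ v8 ^ v9                        # rows of M' give the
    v2 = v5 ^ v8                                  # redundant digits
    v3 = v6 ^ v9
    v4 = v5 ^ v6
    v7 = v8 ^ v9

    v = (v1, v2, v3, v4, v5, v6, v7, v8, v9)
    print([format(d, '03b') for d in v])
    # -> ['111', '101', '010', '110', '011', '101', '001', '110', '111']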

V. DECODING: ERROR LOCATING AND CORRECTION

The decoding procedure of GTB codes consists of three parts: syndrome computation, error locating, and error correction.

A. Syndrome Computation for GTB Codes

Definition 5.1: For a GTB code V = {v | M · v = 0} over GF(Q)^N, where M is an A × N binary matrix, if a codeword v = (v1, v2, ..., vN) is distorted by an error e to ṽ = v ⊕ e, where ṽ, v, e ∈ GF(Q)^N, then the syndrome is:

S = M · ṽ = M · e.

There are A digits in S = (S(1), S(2), ..., S(A)), with S(i) ∈ GF(Q). The support of the syndrome, Ssup = (Ssup(1), Ssup(2), ..., Ssup(A)), is defined as:

Ssup(i) = 0 if S(i) = 0; Ssup(i) = 1 if S(i) ≠ 0.

The A-bit binary syndrome Ssup is used for error locating, and the A-digit Q-ary syndrome S for error correction.
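In software terms, the syndrome and its support of Definition 5.1 amount to the following Python sketch (ours; the toy matrix in the usage line is hypothetical):

    # A sketch of Definition 5.1: a binary-matrix product in which
    # additions are bitwise XORs of b-bit digits.
    def syndrome(M, v_tilde):
        S = []
        for row in M:
            s = 0
            for j, bit in enumerate(row):
                if bit:
                    s ^= v_tilde[j]   # GF(2^b) addition is XOR
            S.append(s)
        return S

    def support(S):
        return [1 if s != 0 else 0 for s in S]

    S = syndrome([[1, 0, 1], [0, 1, 1]], [0b011, 0b101, 0b011])
    print(S, support(S))  # -> [0, 6] [0, 1]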

B. Error Locating

The m-error locating algorithm of GTB codes is given by the following theorem.

Theorem 5.1: Let the columns of an A × N binary matrix M be the set of all codewords of an m-superimposed code constructed by Construction 2.1 from an (nq, kq, dq)q Reed-Solomon code Cq. Let Ssup = (Ssup(1), Ssup(2), ..., Ssup(A)) be the A-bit binary vector representing the support of the syndrome S = M · ṽ = M · e, with ṽ = (ṽ1, ṽ2, ..., ṽN), ṽj = vj ⊕ ej. Let u = (u1, u2, ..., uN) be the N-bit error locating vector of the GTB code such that:

uj = ( Σ_{i | M_{i,j} = 1} Ssup(i) = m + 1 ) ? 1 : 0.

If uj = 1, then ṽj = vj ⊕ ej with ej ≠ 0.

Note: throughout this paper, c = (a = b) ? 1 : 0 denotes c = 1 if a = b, and c = 0 otherwise.

Proof: Construction 2.1 implies that any column j of M contains exactly nq = m + 1 ones. Therefore, if the error ej affects all nq digits of Ssup in whose computation vj participates, then conversely the location of ej can be found by summing all the affected supports of the syndrome and comparing the sum with m + 1. □

We note that Theorem 5.1 does not provide the locations of all m errors in every case. However, it will be shown in Section VI that the fraction of errors missed by this procedure is very small for large Q. We will slightly modify this procedure to reach 100% error correction probability for the important practical cases m = 1 and m = 2 in Section VII. In Section VI we also generalize the procedure of Theorem 5.1 to any m and develop a class of 1-step threshold decoding, whose decoding complexity is still linear in the code length N. This 1-step threshold error locating procedure provides a trade-off between the number of redundant digits R = N − K in a codeword and the probability of missing an error. We will see that by increasing R by at most a factor of 2 we can guarantee that the probability of missing an error is zero.

We also note that the error locating procedure of Theorem 5.1 never wrongly identifies a non-distorted digit of a codeword as a digit containing an error.

Remark 5.1: For a class of error locating Q-ary codes with row density q^{−1} in the binary check matrix (the fraction of 1's in a row), denote by Ne the total number of error locations, by R the minimum number of bits in Ssup, and by p the probability that one or more errors are revealed in one non-zero syndrome bit. For example, for m = 1, p = q/(q^2 + 1) ≈ q^{−1}, and for m = 2, p = (q + (q^2 − q)q)/(1 + q^2 + C(q^2, 2)) ≈ 2/(q + 1). Then we have the following lower bound on R for any Q-ary code:

R ≥ ⌈log_2 Ne⌉ / H(p),

where H(·) is the binary entropy function. The closer the R of an ECC is to the theoretical bound RA = ⌈log_2 Ne⌉ / H(p), the better the code.

Fig. 2: Comparison between the actual GTB code redundancy R1 and the theoretical lower bound RA1 for m = 1, and between R2 and RA2 for m = 2. For m = 1, the ratio of the lower bound to the actual R of GTB codes converges to 1 as q grows.

C. Error Correction

Any located m-error can always be corrected by the algorithm presented in the following theorem.

Theorem 5.2: Let a codeword v over GF(Q) be distorted by an m-digit error e to ṽ = v ⊕ e, and let e be located by the error locating vector u = (u1, u2, ..., uN), where uj = 1 implies ej ≠ 0. Also let S = M · ṽ = M · e be the A-digit syndrome, S = (S(1), S(2), ..., S(A)), S(i) ∈ GF(Q). Then for any non-zero error digit ej of e there must exist at least one row M_{i,∗} of M such that:

Σ_{h | M_{i,h} = 1} uh = 1.

For such a row, S(i) = M_{i,∗} · ṽ = M_{i,∗} · e = ej, and ṽj can be corrected by:

vj = ṽj ⊕ ej = ṽj ⊕ S(i).

Proof: According to Definition 2.1, in a matrix M whose columns are codewords of an m-superimposed code, within any set T of columns with |T| ≤ m + 1, for any column h ∈ T there must exist a row k of M where M_{k,h} = 1 and M_{k,j} = 0 for all j ∈ T, j ≠ h. Since this is true for all columns of T, any m + 1 columns contain an (m + 1) × (m + 1) identity sub-matrix.

The column indices of the m errors are given by u as in Theorem 5.1, and the rows M_{i,∗} of the identity sub-matrix can easily be identified by checking whether only one error participates in the computation of S(i). That is, with T = {j | ej ≠ 0} the set of error locations, for any one of the m errors there exists at least one row M_{i,∗} where only M_{i,j} = 1 and M_{i,h} = 0 for all h ∈ T, h ≠ j.

This m × m identity sub-matrix provides the indices of the syndrome digits affected by a single error digit ej only, so that:

S(i) = M_{i,∗} · ṽ = M_{i,∗} · e = ej.

Therefore vj = ṽj ⊕ ej = ṽj ⊕ S(i). □

To summarize, the decoding procedure consists of the following steps: calculating the syndrome, converting it to the support of the syndrome, error locating, finding the m × m identity sub-matrix corresponding to the m-digit error, and error correction.

Example 5.1: A Q-ary GTB code has Q = 2^3 = 8 and parameters (N,K,D)Q = (9, 2, 6)_8. The distorted codeword is ṽ = (1, 2, 3, 6, 6, 2, 2, 3, 1)_8.

Since D = 6, this code can perform double-error correction. First, by Theorem 3.1 we have:

q = √N = 3; dq = m = (D − 2)/2 = 2; kq = 2; nq = m + 1 = 3.

Thus the check matrix can be constructed from a (nq, kq, dq)q = (3, 2, 2)_3 RS code:

M =
| 1 1 1 0 0 0 0 0 0 |
| 0 0 0 1 1 1 0 0 0 |
| 0 0 0 0 0 0 1 1 1 |
| 1 0 0 1 0 0 1 0 0 |
| 0 1 0 0 1 0 0 1 0 |
| 0 0 1 0 0 1 0 0 1 |
| 1 0 0 0 0 1 0 1 0 |
| 0 0 1 0 1 0 1 0 0 |
| 0 1 0 1 0 0 0 0 1 |

The syndrome and the support of the syndrome are:

S = M · ṽ = (0, 2, 0, 5, 7, 0, 0, 7, 5);
Ssup = (0, 1, 0, 1, 1, 0, 0, 1, 1).

By Theorem 5.1, the double error is located by:

u = (0, 0, 0, 1, 1, 0, 0, 0, 0).

Knowing that e = (0, 0, 0, e4, e5, 0, 0, 0, 0), with e4 ≠ 0, e5 ≠ 0, by Theorem 5.2 the identity sub-matrix corresponding to e4 and e5 is:

| M_{8,4} M_{8,5} |     | 0 1 |
| M_{9,4} M_{9,5} |  =  | 1 0 |

and so:

v4 = ṽ4 ⊕ S(9) = 3;
v5 = ṽ5 ⊕ S(8) = 1;
v = (1, 2, 3, 3, 1, 2, 2, 3, 1)_8. □
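The whole decoding flow of Example 5.1 fits in a short Python sketch (ours), which reproduces the syndrome, the locating vector, and the corrected codeword above:

    # A sketch replaying Example 5.1 for the (9, 2, 6)_8 GTB code.
    M = [
        [1,1,1,0,0,0,0,0,0],
        [0,0,0,1,1,1,0,0,0],
        [0,0,0,0,0,0,1,1,1],
        [1,0,0,1,0,0,1,0,0],
        [0,1,0,0,1,0,0,1,0],
        [0,0,1,0,0,1,0,0,1],
        [1,0,0,0,0,1,0,1,0],
        [0,0,1,0,1,0,1,0,0],
        [0,1,0,1,0,0,0,0,1],
    ]
    m = 2
    v_tilde = [1, 2, 3, 6, 6, 2, 2, 3, 1]

    A, N = len(M), len(M[0])
    S = [0] * A                                   # syndrome over GF(2^3)
    for i in range(A):
        for j in range(N):
            if M[i][j]:
                S[i] ^= v_tilde[j]
    S_sup = [1 if s else 0 for s in S]            # support of the syndrome

    u = [1 if sum(S_sup[i] for i in range(A) if M[i][j]) == m + 1 else 0
         for j in range(N)]                       # Theorem 5.1

    v = list(v_tilde)
    for j in range(N):                            # Theorem 5.2: use a row
        if u[j]:                                  # where j is the only
            for i in range(A):                    # flagged participant
                if M[i][j] and sum(u[h] for h in range(N) if M[i][h]) == 1:
                    v[j] ^= S[i]                  # there, S(i) = e_j
                    break

    print(S)   # -> [0, 2, 0, 5, 7, 0, 0, 7, 5]
    print(u)   # -> [0, 0, 0, 1, 1, 0, 0, 0, 0]
    print(v)   # -> [1, 2, 3, 3, 1, 2, 2, 3, 1]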

VI. GENERALIZED 1-STEP THRESHOLD DECODING

The error correcting algorithm introduced in the previous section requires that, for any ej ∈ e with ej ≠ 0, all nq digits of the syndrome affected by it are non-zero. However, if there exist one or more rows M_{i,∗} of M such that S(i) = M_{i,∗} · ṽ = M_{i,∗} · e = ej ⊕ ek ⊕ ... ⊕ ez = 0, where ej, ek, ..., ez ∈ e, then uj = 0 and ej can be neither located nor corrected. We refer to this case as error masking in position i.

For a Q-ary GTB code, Q = 2^b, it is obvious that the probability of at least one error masking is upper bounded by Q^{−1}. Also, any ej in an m-digit error e affects at most nq = m + 1 digits of the syndrome, of which at most m − 1 can suffer error masking since λ = 1. Therefore the probability Pcorr of no error masking for ej ≠ 0 is lower bounded by:

Pcorr ≥ (1 − Q^{−1})^{m − 1}.   (9)

Under the same m, the larger Q, or b = log Q, is, the greater Pcorr is (see Fig. 3). It is expected that as m grows, the error correcting probability decreases. However, if b is large enough, (9) can still provide a satisfactory Pcorr. In Fig. 3, without loss of generality, we take a (N,K,D)Q = (625, 360, 22)_Q GTB code with 1 ≤ m ≤ 10 and 4 ≤ b ≤ 32.

Fig. 3: The probability Pcorr of correcting any given erroneous digit in the (625, 360, 22)Q GTB code.

In Fig. 3, when b ≥ 16, the curve of Pcorr stays very close to 100% for every m. This means that a GTB codeword consisting of 16-bit digits or larger can locate and correct up to m errors with probability close to 100%.

Moreover, Pcorr can be increased by increasing the redundancy of the GTB code.

Theorem 6.1: Let C_RS be a (nq, kq, dq)q = (m + 1 + Δ, 2, m + Δ)q Reed-Solomon code, and let M be the check matrix of the GTB code V_Δ with (N,K,D)Q = (q^2, q^2 − q(m + 1 + Δ) + m + Δ, 2(m + Δ) + 2)Q. Then for V_Δ we have:

Pcorr ≥ (1 − Q^{−1})^{m − 1 − Δ},   (10)

and Pcorr → 1 when Δ → (m − 1). The error locating vector u = (u1, u2, ..., uN) can then be re-written as:

uj = ( Σ_{i | M_{i,j} = 1} Ssup(i) ≥ m + 1 ) ? 1 : 0.

If uj = 1, then ṽj = vj ⊕ ej with ej ≠ 0.

Proof: If the number of digits in the C_RS codewords increases by Δ, i.e., the number of blocks in M increases to n′q = nq + Δ, then the number of blocks without error masking increases by Δ, and so does the number of syndrome digits without error masking. The lower bound (9) can thus be re-written as:

Pcorr ≥ (1 − Q^{−1})^{m − 1 − Δ}.

If Δ = 0, then (10) is equivalent to (9). If Δ = m − 1, then Pcorr = 1: for any ej, among the 2m syndrome digits it affects there are always at least nq = m + 1 digits indicating that it is an error. This is the same as majority voting, so all m errors can be located and corrected with probability exactly 100%.

Since (Σ Ssup(i) ≥ m + 1) ? 1 : 0 can be realized by a single threshold gate, in which m + 1 or more ones at the input produce a binary 1 at the output, the procedure presented in Theorem 6.1 can be called 1-step threshold decoding. □
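As a minimal sketch (ours), the 1-step threshold rule changes only the comparison in the locating step: the equality test against m + 1 becomes a threshold test.

    # A sketch of the 1-step threshold locating rule of Theorem 6.1.
    # With Delta extra blocks, a flagged digit may hit more than m+1
    # non-zero syndrome digits, so ">=" replaces "==".
    def threshold_locate(M, S_sup, m):
        A, N = len(M), len(M[0])
        return [1 if sum(S_sup[i] for i in range(A) if M[i][j]) >= m + 1
                else 0
                for j in range(N)]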

As an example, by Theorem 6.1 we can improve the error correcting probability shown in Fig. 3. Taking m = 10 and 0 ≤ Δ ≤ 9, the updated data are graphed in Fig. 4.

Fig. 4: The probability of successfully correcting 10 errors as Δ increases.

It can be seen that for different b, Pcorr approaches 1 at different rates. When Δ = m − 1, no matter what b is, Pcorr is always 100%, which matches Theorem 6.1. However, as Δ increases, the code rate decreases. For instance, when Δ = m − 1 = 9 and Pcorr = 1, the original (N,K,D)Q = (625, 360, 22)Q code becomes a (625, 144, 40)Q code.

Example 6.1: By (9), a (N,K,D)Q = (25, 12, 6)_{2^8} GTB code V0 is able to correct double errors with probability 99.6%. The columns of its check matrix are generated from all the codewords of a (nq, kq, dq)q = (3, 2, 2)_5 RS code:

| 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 |
| 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 |
| 0 4 3 2 1 4 3 2 1 0 3 2 1 0 4 2 1 0 4 3 1 0 4 3 2 |

To make it capable of correcting all double errors with probability 100%, we select Δ = m − 1 = 1, so the new matrix is generated from the codewords of a (nq, kq, dq)q = (4, 2, 3)_5 RS code:

| 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 |
| 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 |
| 0 1 2 3 4 2 3 4 0 1 4 0 1 2 3 1 2 3 4 0 3 4 0 1 2 |
| 0 3 1 4 2 2 0 3 1 4 4 2 0 3 1 1 4 2 0 3 3 1 4 2 0 |

By substituting 0, 1, 2, 3, 4 with (10000)^T, (01000)^T, (00100)^T, (00010)^T, (00001)^T, the check matrix M can be constructed. The parameters of the new GTB code V1 are then (N,K,D)Q = (25, 8, 8)_{2^8}, and it corrects all double errors with probability 100% by the 1-step threshold decoding introduced in Theorem 6.1. □

For 1-step threshold decoding, if m = 1, then by (9) we always have Pcorr = 1. When m ≥ 2, by adding at most m − 1 blocks we can achieve Pcorr = 1 at the cost of a lower code rate. For the m = 2 case, the code parameters change from (N,K,D)Q = (q^2, q^2 − 3q + 2, 6)Q to (q^2, q^2 − 4q + 3, 8)Q.

In the following section we introduce a procedure that achieves Pcorr = 1 for m = 2 GTB codes without reducing the code rate. For the rest of the paper we always assume Δ = 0.

VII. SINGLE AND DOUBLE-BYTE ERROR CORRECTING GTB CODES

Single and double errors are the cases most commonly encountered in error correction. For them, GTB codes can always achieve 100% error correcting probability regardless of b.

A. Single-Byte Error Correction

According to Construction 2.1, single-byte error correcting GTB codes can be generated from (nq, kq, dq)q = (2, 2, 1)q Reed-Solomon codes. This results in (N,K,D)Q = (q^2, q^2 − 2q + 1, 4)Q GTB codes for m = 1 with:

A · l = 2q^2 = 2N.

In this case q does not have to be a power of a prime. A check matrix of such a code with q = 3 is given in Example 4.1.

Another way to generate single-error correcting GTB codes is to construct them from a binary Hamming check matrix.

Definition 7.1: A Q-ary GTB code Y with parameters (N,K,D)Q = (N, N − ⌈log_2(N + 1)⌉, 3)Q is defined by Y = {y | M · y = 0}, where M is a binary Hamming check matrix. If a codeword y is distorted by a single error to ỹ = y ⊕ e and the syndrome is S = M · ỹ, then the support of the syndrome Ssup gives the error location, and e = S(i) for any i with S(i) ≠ 0.

From the above definition it is obvious that this Hamming-based GTB code Y has a better code rate, namely smaller redundancy (R = ⌈log_2(N + 1)⌉), than the Reed-Solomon based GTB code V (R = 2√N − 1). However, code Y has a larger decoding complexity:

A · l = ((N + 1)/2) · log_2(N + 1),

while the GTB code V has:

A · l = 2N.
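A sketch (ours) of the single-error correction of Definition 7.1: the support of the syndrome equals the check-matrix column of the distorted digit, and every non-zero syndrome digit equals the error magnitude. M is assumed to be any binary Hamming check matrix with distinct non-zero columns.

    # A sketch of Definition 7.1 for a Hamming-based GTB code Y.
    def correct_single(M, y_tilde):
        A, N = len(M), len(M[0])
        S = [0] * A
        for i in range(A):
            for j in range(N):
                if M[i][j]:
                    S[i] ^= y_tilde[j]
        S_sup = [1 if s else 0 for s in S]
        if not any(S_sup):
            return list(y_tilde)               # no error
        j = next(j for j in range(N)           # column matching the support
                 if [M[i][j] for i in range(A)] == S_sup)
        e = next(s for s in S if s)            # every non-zero S(i) = e_j
        y = list(y_tilde)
        y[j] ^= e
        return y

    # Usage with the (7, 4) Hamming check matrix (column j is j+1 in binary):
    H = [[0, 0, 0, 1, 1, 1, 1],
         [0, 1, 1, 0, 0, 1, 1],
         [1, 0, 1, 0, 1, 0, 1]]
    y = [1, 2, 3, 4, 5, 6, 7]                  # a codeword of {y | H.y = 0}
    y_bad = list(y); y_bad[4] ^= 5             # inject a single byte error
    print(correct_single(H, y_bad) == y)       # -> True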

B. Double-Byte Error Correction

Double-byte errors usually cost much more time and space to locate and correct than single-byte errors. For GTB codes, however, correcting double errors is still very time and cost efficient.

An example of double-error correction was given in Example 5.1. However, if there is an error masking, i.e., e4 = e5, then u4 = u5 = 0, and the double error cannot be guaranteed to be located and corrected. Therefore we propose the theorem below to achieve 100% error locating and correction for m = 2.

Theorem 7.1: Let the columns of an A × N matrix M be the set of all codewords of a 2-superimposed code constructed by Construction 2.1 from a (nq, kq, dq)q Reed-Solomon code Cq. Let Ssup be the A-bit binary vector representing the support of the syndrome S = M · ṽ = M · e, where e is a double error causing one error masking in S. Let w = (w1, w2, ..., wN) be the error locating vector for GTB codes such that:

wj = Σ_{i | M_{i,j} = 1} Ssup(i).

If wj = m + 1 = 3, then ej ≠ 0. If there is no wj = m + 1 = 3, there will be some j with wj = m = 2, which indicate possible error locations. Denote by W = {j | wj = m} the set of possible error locations. If for some j ∈ W there is a row M_{i,∗} in which j is the only member of W participating and:

(wj = 2) ∧ (S(i) = 0) = 1,

then ej = 0, and the remaining items of the set W are the error locations.

Proof: When m = 2, from (7) we know λ = 1, so there can be at most one error masking in the syndrome S. Therefore, if

wj = Σ_{i | M_{i,j} = 1} Ssup(i) = m,

then it is possible that ej ≠ 0. However, there can be more than two candidates in W. From Definition 2.2, the bitwise OR of any 2 columns cannot cover another column. Therefore, for any j with wj = m = 2, there must exist a row M_{i,∗} in which j is the only member of W participating:

Σ_{h ∈ W | M_{i,h} = 1} 1 = 1.

Moreover, since for double-error correcting GTB codes D = 2m + 2 = 6, the syndromes of two different double errors must be different. Therefore, for this row M_{i,∗}, if

S(i) = 0,

then vj cannot contain an error: if there were an error in vj, we would have S(i) = ej ≠ 0. □

Example 7.1: Consider a GTB code V with the same parameters and the same legal codeword v as in Example 5.1, now distorted by a double error at digits 4 and 5 that causes error masking:

e = (0, 0, 0, 7, 7, 0, 0, 0, 0).

Then:

S = M · ṽ = (0, 0, 0, 7, 7, 0, 0, 7, 7);
Ssup = (0, 0, 0, 1, 1, 0, 0, 1, 1);
w = (1, 2, 1, 2, 2, 0, 2, 1, 1);
W = {2, 4, 5, 7}.

The sub-matrix of M consisting of the columns indexed by W, and the syndrome vector, are:

| 1 0 0 0 |       | 0 |
| 0 1 1 0 |       | 0 |
| 0 0 0 1 |       | 0 |
| 0 1 0 1 |       | 7 |
| 1 0 1 0 |   S = | 7 |
| 0 0 0 0 |       | 0 |
| 0 0 0 0 |       | 0 |
| 0 0 1 1 |       | 7 |
| 1 1 0 0 |       | 7 |

It is not hard to find that for M_{1,∗}, (w2 = 2) ∧ (S(1) = 0) = 1, and for M_{3,∗}, (w7 = 2) ∧ (S(3) = 0) = 1. Then e2 = e7 = 0.

Therefore the double error is e = (0, 0, 0, e4, e5, 0, 0, 0, 0) with e4 = 7 and e5 = 7. As in Example 5.1, it can be corrected by Theorem 5.2. □

VIII. CODE RATES COMPARISON

The code rate K/N is a metric often used to compare different codes. By Theorem 3.1 the parameters of GTB codes are:

q_GTB = √N;
R_GTB = (m + 1)q_GTB − m.

For interleaved Orthogonal Latin Square Codes (OLSCs):

q_OLSC = √K;
R_OLSC = 2m·q_OLSC.

And for Reed-Solomon codes:

R_RS = 2m.

It is obvious that RS codes always have the best code rate, and GTB codes should have a better code rate than interleaved OLSCs. Without loss of generality, suppose for example that both codes have the same length N = 625 and 2 ≤ m ≤ 11; the code rates as m changes are shown in Fig. 5.

    Fig. 5: Code rate comparison with the same N and m.

As expected, RS codes, being MDS codes, always have the best code rate. When m is relatively small, the code rates of GTB codes and OLSCs are similar. As m grows, GTB codes' rates become better and better than the OLSCs': when m = 2, GTB achieves 88.3% versus 82.7% for OLSC; when m = 11, GTB codes still have a rate of 54% while OLSCs drop to 5%, roughly 10 times less.

IX. ERROR DETECTION AND CORRECTION PROBABILITY COMPARISON

For an m-error correcting or 2m-error detecting ECC, it is known that the error correction and detection probabilities are not limited by m or 2m respectively; they are also determined by the weight distribution and the uniqueness of the syndromes.

A. Error Detection Probability Comparison

Denoting by p the probability of a digit being distorted to one particular other value, and by Ai the number of codewords of Hamming weight i, the more precise error detection probability using the weight distribution of the codewords is [21]:

Pdet = 1 − Σ_{i=1}^{N} Ai · p^i · (1 − (Q − 1)p)^{N − i}.
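Numerically, the detection probability above is a one-line sum, as in the following Python sketch (ours; the weight distribution passed in the usage line is hypothetical, for illustration only):

    # A sketch of the detection-probability formula of Section IX-A.
    def p_detect(Ai, p, Q, N):
        # Ai maps Hamming weight i to the number A_i of codewords.
        miss = sum(Ai.get(i, 0) * p**i * (1 - (Q - 1) * p)**(N - i)
                   for i in range(1, N + 1))
        return 1 - miss

    print(p_detect({4: 10, 6: 5}, 1e-3, 8, 32))  # hypothetical A_i values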

Under the same m, it is fair to compare codes of similar length; if the code lengths differ greatly, as for GTB versus Reed-Solomon, the code with the larger length will be much better simply due to its weight distribution. Therefore, under the same N, m, and p, we compare GTB codes and OLSCs.

Since the bit distortion rate p is usually close to 0, both error detection probabilities are close to 100%. To make the comparison more visible, we graph the relative improvement of the GTB codes' error detection probability PGTB over the OLSCs' POLSC:

Improvement = (PGTB − POLSC) / POLSC.

Fig. 6 shows an example of the differences in error detection under various p and Q.

Fig. 6: Error detection improvement of GTB codes over OLSCs of the same length. Here N = 32, m = 2, and p decreases from 10^{−1} to 10^{−6}. For every p, the three bars correspond to Q = 2, Q = 4, and Q = 8.

As expected, the larger the bit distortion rate p, the larger the error detection improvement. When p = 10^{−1}, the GTB codes' error detection probability is over 80% higher than the OLSCs'. Even as p decreases toward 0 and both error detection probabilities approach 100%, GTB codes still perform better.

Fig. 7 takes a closer look at the behavior for a fixed p.

Fig. 7: A zoom-in of Fig. 6 for a fixed p. As Q increases, the improvement of the GTB codes' error detection over the OLSCs' increases.

The previous results have all shown that the advantage of GTB codes grows with Q, and this is demonstrated again by the error detection probability. The behavior in the figure above appears for all values of p.

B. (m + 1)-Error Correction Probability Comparison

A code with distance D = 2m + 1 or D = 2m + 2 is able to correct all m-digit errors. However, it is usually also capable of correcting more than m errors with a certain probability; in particular, an m-error correcting code is very likely to be able to correct most (m + 1)-digit errors. As long as the syndromes S = M · ṽ of different error patterns are different, the errors can be corrected uniquely.

Taking the same parameters as in sub-section A, when both the GTB and OLSC codes have the same N and m, the probability of correcting m + 1 = 3 errors is over 90% for both. To make the comparison more visible, we graph the error missing probability:

Pmiss = 1 − (number of unique syndromes) / (number of total syndromes).

Fig. 8: Error missing probabilities for triple errors with m = 2 error correcting GTB and OLSC codes. GTB codes always miss fewer triple errors than OLSC codes.

X. HARDWARE IMPLEMENTATION

The algorithms for error locating and correction were introduced in Theorems 5.1 and 5.2 of Section V. The decoder consists of five components: syndrome computation, support-of-syndrome conversion, error locating, finding the error magnitudes, and error correction. Since all five components are combinational networks, their overall latency is negligible.

A. GTB Codes' Decoding Complexity

Theorems 5.1 and 5.2 involve no complicated logic. The complexity of the hardware can be estimated in the number of equivalent 2-input gates:

1) Syndrome computation:

S = M · ṽ = M · e.

Hardware cost: bA(l − 1).

2) Support of syndrome conversion:

Ssup(i) = 0 if S(i) = 0; Ssup(i) = 1 if S(i) ≠ 0.

Hardware cost: A(b − 1).

3) Error locating:

uj = ( Σ_{i | M_{i,j} = 1} Ssup(i) = m + 1 ) ? 1 : 0.

Hardware cost: mN.

4) Finding the identity matrix:

Row(i) = ( Σ_{j | M_{i,j} = 1} uj = 1 ) ? 1 : 0.

Hardware cost: A(2.5l + log l − 1).

5) Error correction:

ej = ∨_{i | M_{i,j} = 1} Row(i) · S(i),

where ∨ is the bitwise OR over all elements in its subscript. Hardware cost: Nb(m + 1) + b(A + N).

Adding the five hardware cost estimates together and denoting the decoding complexity by L:

L = bA(l − 1) + A(b − 1) + mN + A(2.5l + log l − 1) + Nb(m + 1) + b(A + N)
  ≈ b(Al + mN).

Since Al ≈ mN:

L ≈ b(Al + mN) ≈ 2mNb.   (11)

Thus from (11) it follows that the decoding complexity of GTB codes is linear in the codeword length N, the digit size b = log Q, and the number of errors to be corrected m.
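Summing the five estimates of sub-section A gives a quick numeric check of (11), as the sketch below (ours) shows; A = q(m + 1) and l = q follow from the optimal construction with q = √N.

    # A sketch totalling the five gate-count estimates of Section X-A.
    import math

    def decoding_complexity(N, m, b):
        q = math.isqrt(N)
        A, l = q * (m + 1), q
        return (b * A * (l - 1)                      # syndrome computation
                + A * (b - 1)                        # support conversion
                + m * N                              # error locating
                + A * (2.5 * l + math.log2(l) - 1)   # identity matrix
                + N * b * (m + 1) + b * (A + N))     # error correction

    print(decoding_complexity(625, 2, 32))  # same order as 2mNb = 80000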

The schematic of the decoding system is as follows:

Fig. 9: The system has 5 components: computing the syndrome; converting it to the support of the syndrome; error locating; finding the identity matrix; error correction. The bit width of each bus is labeled.

All 5 components are simple circuits: bitwise XOR gates, bitwise OR gates, and nq-bit and m-bit adders. Compared with other non-binary error correcting codes, which require finite field multipliers or even inverters, this is much more cost-efficient.

Moreover, the circuit in Fig. 9 is a combinational network and takes almost no time in decoding. In contrast, many other popular ECCs require decoders working with a relatively large latency, proportional to the codeword length N and the number of errors to be corrected m.

B. Decoding Complexity Comparison

As stated above, the GTB codes' decoder is much more cost-efficient than those of other codes. To verify this experimentally, we implemented the decoder of Fig. 9 and compared it with other decoders. In this sub-section the decoding complexity L is defined as:

L = hardware (in 2-input gates) × latency (in clock cycles).

We discuss the two most important cases, m = 1 and m = 2; single and double errors are also the most common hardware distortions.

Since we focus on the application to reliable memories, we use GTB codes to protect 512 bits of data, a common vector size in processors and memories such as the Intel Xeon Phi, Nvidia GTX280, and AMD Radeon R9 [22].

1) Single-Byte Error Decoding Complexity Comparison: The major competitors of GTB codes for m = 1 are non-binary Hamming codes, Reed-Solomon (RS) codes, and interleaved binary Hamming codes [23].

For the experiment of protecting a 512-bit data vector in memory, we select m = 1, b = 32, K = 16. The decoders of five different codes were implemented on a Xilinx Virtex-4 XC4VFX60 FPGA board.

Fig. 10: Hardware cost of five different codes' decoders for m = 1. The RS-based GTB code's hardware cost is set to 1 as the baseline.

The figure makes clear that the RS-based GTB code has the cheapest hardware decoder, followed by the Hamming-based GTB code. All other codes have decoding complexities 70%–150% higher than those of the GTB codes.

2) Double-Byte Error Decoding Complexity Comparison: The major competitors of GTB codes for m = 2 are Reed-Solomon codes and interleaved Orthogonal Latin Square Codes (OLSCs) [24].

As in the previous experiment, we select m = 2, b = 32, K = 16. The decoders of three different codes were implemented on a Xilinx Virtex-4 XC4VFX60 FPGA board.

Fig. 11: Hardware cost of three different codes' decoders for m = 2. The RS-based GTB code's hardware cost is set to 1 as the baseline.

When m = 2, although the other codes may have better code rates, the interleaved OLSC costs 5 times more than the GTB code and RS costs almost 50 times more. As m increases, the GTB codes' advantage over the other codes increases as well.

By both theoretical estimation and practical experiment, as far as we know the decoding of GTB codes, with complexity 2mNb, has the lowest complexity among (N,K, 2m + 2)Q codes.

XI. CONCLUSION

As multi-bit upsets resulting in multiple byte errors become more probable in newer and faster memories, stronger protection against byte-level distortions is in high demand. We have therefore introduced the new non-binary group testing based byte-level error correcting codes (GTB codes) and their application to single and multi-error correction. For codewords with digits in the Galois field GF(Q), the decoding of the proposed new codes requires no multiplications or inversions in Galois fields. The cost is a larger redundancy in exchange for a much lower decoding complexity than other known codes such as Hamming and Reed-Solomon codes. Compared with low-complexity codes such as bit-interleaved codes, GTB codes have the advantage of a much better code rate. Moreover, GTB codes decode much faster than other codes thanks to their fully combinational network.

The check matrices of GTB codes are generated from binary superimposed codes, which enables low-complexity decoding with mere binary operations. The hardware overhead for decoding is as low as O(mNb), and as Q = 2^b increases, the computational complexity increases only proportionally to b. In contrast, popular codes such as Reed-Solomon codes operate with a decoding complexity proportional to at least b^2. These characteristics make GTB codes a promising low-cost and high-reliability ECC for the design of reliable memories.

Based on the GTB codes' fast and low-complexity decoding, we suggest that they can serve as a replacement for the current popular error correcting codes in reliable memory designs requiring small latency and low decoding complexity, such as DRAM, SRAM for caches, and EEPROM and Flash for cryptographic devices [25].

REFERENCES

[1] Z. Wang and M. Karpovsky, "Reliable and secure memories based on algebraic manipulation detection codes and robust error correction," Proc. Int. Depend. Symp., 2013.
[2] E. Fujiwara, Code Design for Dependable Systems: Theory and Practical Applications. John Wiley & Sons, p. 264, 2006.
[3] G. Umanesan and E. Fujiwara, "A class of codes for correcting single spotty byte errors," IEICE Transactions on Fundamentals of Electronics, vol. 86.3, pp. 704-714, 2003.
[4] Z. Wang, M. Karpovsky, and K. J. Kulikowski, "Replacing linear Hamming codes by robust nonlinear codes results in a reliability improvement of memories," IEEE/IFIP International Conference on Dependable Systems & Networks (DSN'09), 2009.
[5] Z. Wang and M. Karpovsky, "New error detecting codes for the design of hardware resistant to strong fault injection attacks," Proc. Int. Conference on Security and Management, SAM, 2012.
[6] Y. Wu, "New list decoding algorithms for Reed-Solomon and BCH codes," IEEE International Symposium on Information Theory (ISIT 2007), 2007.
[7] J. Jeng and T. Truong, "On decoding of both errors and erasures of a Reed-Solomon code using an inverse-free Berlekamp-Massey algorithm," IEEE Transactions on Communications, no. 47.10, pp. 1488-1494, 1999.
[8] Xilinx, LogiCORE IP Reed-Solomon Decoder, v8.0, ds862 ed., October 19, 2011.
[9] Y. Cui and X. Zhang, "Research and implementation of interleaving grouping Hamming code algorithm," IEEE International Conference on Signal Processing, Communication and Computing (ICSPCC), pp. 1-4, 2013.
[10] S. Laendner and O. Milenkovic, "LDPC codes based on Latin squares: cycle structure, stopping set, and trapping set analysis," IEEE Transactions on Communications, no. 55.2, pp. 303-312, 2007.
[11] A. G. D'yachkov, A. J. Macula, and V. V. Rykov, "On optimal parameters of a class of superimposed codes and designs," IEEE International Symposium on Information Theory, 1998.
[12] A. G. D'yachkov, A. J. Macula, and V. V. Rykov, "New applications and results of superimposed code theory arising from the potentialities of molecular biology," Numbers, Information and Complexity, pp. 265-282, 2000.
[13] W. Kautz and R. Singleton, "Nonrandom binary superimposed codes," IEEE Transactions on Information Theory, no. 10.4, pp. 363-377, 1964.
[14] A. G. D'yachkov and V. V. Rykov, "Bounds for the length of disjunctive codes," Problems of Information Transmission, vol. 18, no. 3, pp. 7-13, 1982.
[15] P. Luo, A. Lin, Z. Wang, and M. Karpovsky, "Hardware implementation of secure Shamir's secret sharing scheme," IEEE 15th International Symposium on High-Assurance Systems Engineering (HASE), 2014.
[16] A. G. D'yachkov and V. V. Rykov, "Optimal superimposed codes and designs for Renyi's search model," Journal of Statistical Planning and Inference, no. 100.2, pp. 281-302, 2002.
[17] S. Ling and C. Xing, Coding Theory: A First Course. Cambridge University Press, 2004.
[18] Z. Wang, M. G. Karpovsky, and L. Bu, "Design of reliable and secure devices realizing Shamir's secret sharing," IEEE Transactions on Computers, vol. PP, Oct. 2015.
[19] L. Bu, M. G. Karpovsky, and Z. Wang, "New byte error correcting codes with simple decoding for reliable cache design," 21st IEEE On-Line Testing Symposium (IOLTS), 2015.
[20] C. D. Meyer, Matrix Analysis and Applied Linear Algebra. SIAM, 2000.
[21] J. C. Moreira and P. G. Farrell, Essentials of Error-Control Coding. John Wiley & Sons, 2006.
[22] A. Fog, The Microarchitecture of Intel, AMD and VIA CPUs: An Optimization Guide for Assembly Programmers and Compiler Makers. Technical University of Denmark, 2014.
[23] Y. Cui and X. Zhang, "Research and implementation of interleaving grouping Hamming code algorithm," IEEE International Conference on Signal Processing, Communication and Computing (ICSPCC), 2013.
[24] G. Yalcin et al., "Exploiting a fast and simple ECC for scaling supply voltage in level-1 caches," IEEE On-Line Testing Symposium (IOLTS), 2014.
[25] S. Ge, Z. Wang, P. Luo, and M. Karpovsky, "Secure memories resistant to both random errors and fault injection attacks using nonlinear error correction codes," ACM Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy, 2013.