Final Paper for Publication


A Novel Implementation of CRC Algorithm in XMODEM Protocol on FPGA Using VHDL

[1] A. Bhargav, Dept. of ECE, SCET

[2] J.E.N. Abhilash, Associate Professor, Dept. of ECE, SCET, Narsapur

[3] N. Srikanth, Associate Professor, Dept. of ECE, SCET, Narsapur

Abstract - The Xmodem protocol requires a CRC to ensure that data is received correctly. There are many implementations of the CRC algorithm. In this paper, the CRC algorithm is implemented in the Xmodem protocol. An 8-bit parallel implementation of CCITT CRC16 reduces the number of clock cycles required to generate the CRC. The implementation handles both single-byte and multi-byte inputs. The CRC for 1024 bits of data can be obtained in 128 clock cycles, with the design described in VHDL.

Keywords - LFSR; CRC parallel computation; Xmodem protocol; VHDL

I. INTRODUCTION

Xmodem is a widely used asynchronous file transfer protocol. The standard has two main variants: Xmodem and 1k-Xmodem. Xmodem transmits data in 128-byte blocks, while 1k-Xmodem uses 1k (1024-byte) blocks; both variants support checksum and CRC verification. Both also support retransmission (generally up to 10 attempts) if the transmitted data is corrupted. The Xmodem CRC is computed over the 128-byte (or 1024-byte) packet as a whole. When the receiver receives a data packet, it sends back an acknowledgement character if the check is correct, and a negative-acknowledgement character to request retransmission if it is not. The efficiency of Xmodem data transfer is therefore directly affected by the packet check time.
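As a rough illustration of the receiver-side check described above, the following Python sketch accepts or rejects one packet. The names `crc16_ccitt` and `check_packet` are hypothetical, and the values ACK = 0x06 and NAK = 0x15 are the ASCII control codes conventionally used by Xmodem, which is an assumption here because the text does not name the characters. The bitwise CRC loop is only a software stand-in for the check, not the hardware design of this paper.

```python
ACK = 0x06  # assumed: ASCII ACK, sent when the packet check passes
NAK = 0x15  # assumed: ASCII NAK, sent to request retransmission


def crc16_ccitt(data: bytes) -> int:
    """Bitwise CCITT CRC16 (polynomial 0x1021, register cleared to 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8                 # bring the next byte into the top 8 bits
        for _ in range(8):
            if crc & 0x8000:             # top bit set: shift and apply polynomial
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:                        # top bit clear: plain shift
                crc = (crc << 1) & 0xFFFF
    return crc


def check_packet(payload: bytes, received_crc: int) -> int:
    """Return ACK if the CRC over the whole payload matches, else NAK."""
    return ACK if crc16_ccitt(payload) == received_crc else NAK


packet = b"A" * 128                      # one 128-byte Xmodem block
good_crc = crc16_ccitt(packet)
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]   # single-bit error
print(check_packet(packet, good_crc) == ACK)
print(check_packet(corrupted, good_crc) == NAK)
```

Because the whole block must be checked before the reply is sent, the time spent in the CRC computation adds directly to every packet exchange.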

Commonly used methods for detecting data transmission errors include parity and the cyclic redundancy check (CRC). Both parity and CRC can be applied to a single byte or word, while CRC is also suitable for validating whole packets of data. CRC is essential in bulk data transfer applications such as the Xmodem protocol, HEX files, RFID protocols and USB communication.

II. CRC ALGORITHM'S PRINCIPLE

The basic idea of CRC comes from linear coding theory. The k-bit binary number to be transmitted is shifted left by r bits, and the r vacated positions on the right are filled with zeros; these positions are where the CRC will be placed. The resulting k + r bits are divided modulo 2 by the generator polynomial, and the remainder is the CRC.

The original k-bit number and the r-bit CRC are transmitted together. The receiver divides the received k + r bits modulo 2 by the generator polynomial; if the remainder is zero the data was transmitted correctly, otherwise it is in error. The CRC principle is specified in detail as follows.

The k-bit binary number to be transmitted is represented by M(x):

$$M(x) = C_{k-1}x^{k-1} + C_{k-2}x^{k-2} + \dots + C_i x^i + \dots + C_1 x + C_0 \tag{1}$$

Shifting the data sequence left by r bits is equivalent to multiplying by $x^r$, where r is the highest power of the generator polynomial G(x):

$$M(x) \cdot x^r = C_{k-1}x^{k+r-1} + C_{k-2}x^{k+r-2} + \dots + C_1 x^{r+1} + C_0 x^r \tag{2}$$

$M(x) \cdot x^r$ is then divided modulo 2 by the generator polynomial G(x):

$$\frac{M(x) \cdot x^r}{G(x)} = Q(x) + \frac{R(x)}{G(x)} \tag{3}$$

The remainder R(x) is the CRC. It can be expressed as:

$$R(x) = d_{r-1}x^{r-1} + \dots + d_1 x + d_0 \tag{4}$$

The final data to be transmitted is:

$$M' = (C_{k-1}, \dots, C_0, d_{r-1}, \dots, d_1, d_0) \tag{5}$$
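The division in equations (1) to (5) can be tried out with a few lines of Python. This is a minimal sketch under stated assumptions: polynomials are stored as integers whose bit i is the coefficient of $x^i$, and the function names `mod2_remainder` and `encode` are illustrative only.

```python
def mod2_remainder(message: int, k: int, generator: int) -> int:
    """Remainder R(x) of M(x) * x^r divided modulo 2 by G(x)."""
    r = generator.bit_length() - 1           # degree of G(x)
    dividend = message << r                  # M(x) * x^r, equation (2)
    for shift in range(k - 1, -1, -1):       # long division, highest term first
        if dividend & (1 << (shift + r)):
            dividend ^= generator << shift   # modulo 2 subtraction is XOR
    return dividend                          # only the low r bits remain


def encode(message: int, k: int, generator: int) -> int:
    """Code word M': the k message bits followed by the r CRC bits."""
    r = generator.bit_length() - 1
    return (message << r) | mod2_remainder(message, k, generator)


G = 0x11021                                  # x^16 + x^12 + x^5 + 1
crc = mod2_remainder(0b01000001, 8, G)
print(format(crc, "016b"))                   # 0101100011100101

# Receiver side: dividing the received k + r bits leaves a zero remainder.
codeword = encode(0b01000001, 8, G)
print(mod2_remainder(codeword, 24, G))       # 0
```

The first printed value is the same 16-bit CRC obtained for the input 01000001 in the worked examples later in the paper.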

The transformation from M to M' is the CRC encoding process. Based on the above formulas, the hardware circuit can be built from a typical LFSR (linear feedback shift register), as shown in Fig 1.



Fig 1. Linear Feedback Shift Register LFSR

Fig 1 is a typical circuit. If a coefficient of the generator polynomial G(x) is 1, the corresponding D flip-flop is fed from the output of the XOR gate; if the coefficient is 0, the D flip-flop is fed directly from the output of the preceding flip-flop. The figure can therefore be greatly simplified when the generator polynomial is fixed.

Several standard generator polynomials are used in practical applications:

CRC8: The polynomial is $x^8 + x^5 + x^4 + 1$, and the corresponding number is 0x131;

CRC12: The polynomial is $x^{12} + x^{11} + x^3 + x^2 + 1$, and the corresponding number is 0x180D;

CCITT CRC16: The polynomial is $x^{16} + x^{12} + x^5 + 1$, and the corresponding number is 0x11021;

ANSI CRC16: The polynomial is $x^{16} + x^{15} + x^2 + 1$, and the corresponding number is 0x18005;

CRC32: The polynomial is $x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1$, and the corresponding number is 0x104C11DB7.
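The correspondence between each polynomial and its number is simply one bit per term. The short Python sketch below rebuilds the numbers from the exponents listed above; the helper name `poly_to_int` and the dictionary layout are illustrative assumptions.

```python
def poly_to_int(exponents):
    """Set one bit for every term x^e of the generator polynomial."""
    value = 0
    for e in exponents:
        value |= 1 << e
    return value


# Exponents copied from the list above (0 stands for the constant term 1).
STANDARD_POLYNOMIALS = {
    "CRC8": [8, 5, 4, 0],
    "CRC12": [12, 11, 3, 2, 0],
    "CCITT CRC16": [16, 12, 5, 0],
    "ANSI CRC16": [16, 15, 2, 0],
    "CRC32": [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0],
}

for name, exponents in STANDARD_POLYNOMIALS.items():
    print(name, hex(poly_to_int(exponents)))
```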

In the Xmodem protocol, the standard CCITT CRC16 polynomial $x^{16} + x^{12} + x^5 + 1$ is used as the generator polynomial, and the corresponding number is 0x11021. The CRC hardware circuit can therefore be simplified as shown in Fig 2.

Fig 2. The circuit implementation of CCITT CRC16 LFSR

1) First, the flip-flops are cleared by CR, and the upper 16 bits (2 bytes) of the data to be checked are shifted into the 16 flip-flops. Because the flip-flops have been cleared, these upper 16 bits of the data stream enter unchanged.

2) The remaining data is then shifted in. If the output of flip-flop r15 is 0, the register simply shifts by one bit; if the output is 1, the register shifts by one bit and the modulo 2 operation with the generator polynomial is applied.

3) Finally, after the whole data stream M(x) has entered the flip-flops, 16 zero bits are shifted in to complete the CRC calculation for this group of data.

The above circuit follows the general procedure of modulo 2 division. Its biggest drawback is that 16 zeros must be shifted in after the data stream M(x), so the CRC register needs 16 additional clock cycles.
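The three steps above can be modelled in software as follows. This is a sketch, assuming the message bits arrive most significant bit first and that one loop iteration corresponds to one clock cycle; the function name `crc16_serial` is hypothetical.

```python
POLY = 0x1021  # x^12 + x^5 + 1; the x^16 term is the bit shifted out of r15


def crc16_serial(message_bits):
    """Shift the message through the 16-bit LFSR; return (crc, clock_cycles)."""
    reg = 0                                    # step 1: flip-flops cleared
    clocks = 0
    for bit in list(message_bits) + [0] * 16:  # step 3: 16 trailing zero bits
        feedback = (reg >> 15) & 1             # output of flip-flop r15
        reg = ((reg << 1) | bit) & 0xFFFF      # step 2: shift in one bit
        if feedback:
            reg ^= POLY                        # modulo 2 operation
        clocks += 1
    return reg, clocks


crc, clocks = crc16_serial([int(c) for c in "01000001"])
print(format(crc, "016b"), clocks)             # 0101100011100101 24

_, clocks_128 = crc16_serial([0] * 1024)       # one 128-byte packet
print(clocks_128)                              # 1040
```

The cycle counts make the cost of the serial approach visible: 24 clocks for a single byte and 1040 clocks for a 128-byte packet.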

EXAMPLE:

Fig 3 shows an example of 16-bit CRC generation using the circuit of Fig 2.

Fig 3. Hardware Implementation Of 16-Bit CRC

Here the input data is 8 bits wide; m0-m7 in Fig 3 denote the input message bits. The basic process applies XOR operations at specific bit positions determined by the standard generator polynomial. In the Xmodem protocol the CCITT CRC-16 polynomial $x^{16} + x^{12} + x^5 + 1$ is used, so the XOR operations are performed at the positions given by the terms $x^{16}$, $x^{12}$, $x^5$ and 1. Sixteen zero bits are appended to the input message bits, giving a data stream of 8 + 16 bits in total, so 24 clock cycles are needed to find the CRC of 8-bit data. The operations of the last 8 clock cycles are shown in Fig 3; the final row gives the 16-bit CRC.

In every clock cycle, the last bit of the highest nibble (the most significant bit of the register) is taken as feedback for the next cycle, and this feedback bit is XORed with some intermediate bits as shown in Fig 3. The main drawback of this process is that 24 clock cycles are needed to find the CRC of 8-bit data. The parallel implementation reduces the number of clock cycles considerably.

III. DESIGN OF CRC PARALLEL COMPUTATION

In the Xmodem protocol, each packet is 128 bytes (1024 bits), so 1040 (1024 + 16) clock cycles are needed to compute the CRC using the circuit of Fig 2. This design uses parallel computation implemented in hardware to improve real-time performance.

We mainly describe 8-bit parallel CRC computation. The state of the flip-flops is the CRC remainder, as shown in Fig 1. In serial operation, the remainder depends only on the current input and on the remainder in the previous state. The 8-bit parallel operation is calculated as follows.

Let $r_j^i$ denote the value of flip-flop j after the i-th clock, where i = 1, 2, ..., n indexes the input code sequence and j = 0, 1, ..., k-1 indexes the flip-flops. Then

$$r_j^i = G_j \cdot r_{k-1}^{i-1} \oplus r_{j-1}^{i-1}, \qquad r_{j-1}^{i-1} = 0 \ \text{for} \ j = 0 \tag{6}$$

The input data is 8 bits wide, so the maximum value of i is 8. Using equation (6) and the CCITT CRC16 polynomial ($G(x) = x^{16} + x^{12} + x^5 + 1$, that is, k = 16), we can derive $r_0^8, \dots, r_{15}^8$ recursively.

The serial circuit needs 24 clock cycles to calculate the CRC of 8-bit data. After the first 16 clock cycles, the 8-bit data occupies the upper eight flip-flops and the lower eight flip-flops hold zeros. Taking this as the initial moment, the CRC of the 8-bit data is obtained after 8 further clock cycles during which the input is zero. The initial flip-flop values are therefore:

$$r_0^0 = r_1^0 = \dots = r_7^0 = 0, \qquad (r_8^0, r_9^0, \dots, r_{14}^0, r_{15}^0) = (M_0, M_1, \dots, M_6, M_7)$$

Let $r_j^8$ denote the value after 8 clock cycles. Substituting repeatedly into equation (6) gives the final expression for each $r_j^8$.

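$$r_0^8 = r_{15}^7 = r_{13}^5 = r_{12}^4 = r_{15}^3 \oplus r_{11}^3 = r_{14}^2 \oplus r_{10}^2 = r_{12}^0 \oplus r_8^0 = M_0 \oplus M_4$$

$$r_1^8 = r_0^7 = r_{13}^4 = r_{12}^3 = r_{11}^2 \oplus r_{15}^2 = r_9^0 \oplus r_{13}^0 = M_1 \oplus M_5$$

The other terms are derived similarly:

$$\begin{aligned}
r_2^8 &= r_{10}^0 \oplus r_{14}^0 = M_2 \oplus M_6 \\
r_3^8 &= r_{11}^0 \oplus r_{15}^0 = M_3 \oplus M_7 \\
r_4^8 &= r_{12}^0 = M_4 \\
r_5^8 &= r_8^0 \oplus r_{12}^0 \oplus r_{13}^0 = M_0 \oplus M_4 \oplus M_5 \\
r_6^8 &= r_9^0 \oplus r_{13}^0 \oplus r_{14}^0 = M_1 \oplus M_5 \oplus M_6 \\
r_7^8 &= r_{10}^0 \oplus r_{14}^0 \oplus r_{15}^0 = M_2 \oplus M_6 \oplus M_7 \\
r_8^8 &= r_0^0 \oplus r_{11}^0 \oplus r_{15}^0 = M_3 \oplus M_7 \\
r_9^8 &= r_1^0 \oplus r_{12}^0 = M_4 \\
r_{10}^8 &= r_2^0 \oplus r_{13}^0 = M_5 \\
r_{11}^8 &= r_3^0 \oplus r_{14}^0 = M_6 \\
r_{12}^8 &= r_4^0 \oplus r_8^0 \oplus r_{12}^0 \oplus r_{15}^0 = M_0 \oplus M_4 \oplus M_7 \\
r_{13}^8 &= r_5^0 \oplus r_9^0 \oplus r_{13}^0 = M_1 \oplus M_5 \\
r_{14}^8 &= r_6^0 \oplus r_{10}^0 \oplus r_{14}^0 = M_2 \oplus M_6 \\
r_{15}^8 &= r_7^0 \oplus r_{11}^0 \oplus r_{15}^0 = M_3 \oplus M_7
\end{aligned}$$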
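The sixteen expressions can be cross-checked by running the LFSR recurrence symbolically for 8 clocks. In the Python sketch below, each flip-flop holds the set of message-bit indices whose XOR gives its value, so XOR of two flip-flops becomes the symmetric difference of two sets. The function name `derive_parallel_equations` and the set representation are assumptions made for illustration.

```python
TAPS = (5, 12)  # flip-flops fed through an XOR gate (terms x^5 and x^12)


def derive_parallel_equations():
    """Return, for j = 0..15, the sorted message-bit indices XORed into r_j^8."""
    # Initial moment: r0..r7 hold zero (empty set), r8..r15 hold M0..M7.
    r = [frozenset() for _ in range(16)]
    for m in range(8):
        r[8 + m] = frozenset({m})
    for _ in range(8):                     # 8 clocks with zero serial input
        feedback = r[15]
        nxt = [feedback]                   # r0 receives the feedback (G0 = 1)
        for j in range(1, 16):
            if j in TAPS:
                nxt.append(r[j - 1] ^ feedback)   # XOR as symmetric difference
            else:
                nxt.append(r[j - 1])
        r = nxt
    return [sorted(s) for s in r]


for j, terms in enumerate(derive_parallel_equations()):
    print(f"r{j}^8 =", " xor ".join(f"M{t}" for t in terms))
```

Each printed line lists the message bits that appear on the right-hand side of the corresponding expression above.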

Using the above logic, the 8-bit parallel CCITT CRC-16 hardware circuit is easy to build, as shown in Fig 4.

Fig 4. 8-Bit Parallel CCITT CRC16 Hardware Circuit

EXAMPLE:

Let us consider the same example as above.

The input bits are '01000001'.

Therefore M7 = 0, M6 = 1, M5 = 0, M4 = 0, M3 = 0, M2 = 0, M1 = 0, M0 = 1.

Substituting these values into the equations gives:

D0 = M0 ⊕ M4 = 1 ⊕ 0 = 1

D1 = M1 ⊕ M5 = 0 ⊕ 0 = 0

D2 = M2 ⊕ M6 = 0 ⊕ 1 = 1

D3 = M3 ⊕ M7 = 0 ⊕ 0 = 0

D4 = M4 = 0

D5 = M0 ⊕ M4 ⊕ M5 = 1 ⊕ 0 ⊕ 0 = 1

D6 = M1 ⊕ M5 ⊕ M6 = 0 ⊕ 0 ⊕ 1 = 1

D7 = M2 ⊕ M6 ⊕ M7 = 0 ⊕ 1 ⊕ 0 = 1

D8 = M3 ⊕ M7 = 0 ⊕ 0 = 0

D9 = M4 = 0

D10 = M5 = 0

D11 = M6 = 1

D12 = M0 ⊕ M4 ⊕ M7 = 1 ⊕ 0 ⊕ 0 = 1

D13 = M1 ⊕ M5 = 0 ⊕ 0 = 0

D14 = M2 ⊕ M6 = 0 ⊕ 1 = 1

D15 = M3 ⊕ M7 = 0 ⊕ 0 = 0

The final 16-bit CRC, read from D15 down to D0, is 0101100011100101, which is exactly the value calculated in the previous example.
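The substitution above can be repeated for any input byte with a direct software model of the sixteen XOR equations. The following Python function is a sketch, not the VHDL of the design; its name `crc16_parallel_byte` is hypothetical, and it assumes the register starts cleared, as in the example.

```python
def crc16_parallel_byte(byte: int) -> int:
    """CRC of one byte using the 8-bit parallel equations D0..D15."""
    m = [(byte >> i) & 1 for i in range(8)]   # m[0] = M0 (LSB) ... m[7] = M7
    d = [
        m[0] ^ m[4],                # D0
        m[1] ^ m[5],                # D1
        m[2] ^ m[6],                # D2
        m[3] ^ m[7],                # D3
        m[4],                       # D4
        m[0] ^ m[4] ^ m[5],         # D5
        m[1] ^ m[5] ^ m[6],         # D6
        m[2] ^ m[6] ^ m[7],         # D7
        m[3] ^ m[7],                # D8
        m[4],                       # D9
        m[5],                       # D10
        m[6],                       # D11
        m[0] ^ m[4] ^ m[7],         # D12
        m[1] ^ m[5],                # D13
        m[2] ^ m[6],                # D14
        m[3] ^ m[7],                # D15
    ]
    return sum(bit << j for j, bit in enumerate(d))


print(format(crc16_parallel_byte(0b01000001), "016b"))   # 0101100011100101
```

All sixteen outputs depend only on the eight input bits, so in hardware they can be produced in a single clock instead of 24.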



IV. THE IMPLEMENTATION OF MULTIBYTE CRC ALGORITHM

An Xmodem data packet is 128 bytes, so the above 8-bit parallel CRC circuit must be applied 128 times (or 1024 times for 1k-Xmodem) to check a complete packet.

Because the byte is the smallest unit of data sent in a packet, the CRC of a multi-byte packet can be analysed byte by byte. Suppose the packet contains n bytes, namely

$[D_n, D_{n-1}, D_{n-2}, \dots, D_3, D_2, D_1]$.

The implementation steps are summarized as follows:

(1) Calculate the CRC of the first byte $D_n$. XOR (modulo 2 add) the high 8 bits of this CRC with $D_{n-1}$ to obtain the new $D_{n-1}$, and XOR the low 8 bits of the CRC with $D_{n-2}$ to obtain the new $D_{n-2}$. The packet becomes $[D_{n-1}, D_{n-2}, \dots, D_2, D_1]$.

(2) Calculate the CRC of the first byte of this packet, $D_{n-1}$. XOR the high 8 bits of the CRC with $D_{n-2}$ to obtain the new $D_{n-2}$, and XOR the low 8 bits of the CRC with $D_{n-3}$ to obtain the new $D_{n-3}$. The packet becomes $[D_{n-2}, D_{n-3}, \dots, D_2, D_1]$.

(3) Continue in this way until only two bytes $[D_2, D_1]$ remain. Then append two zero bytes and calculate the remainder of $[D_2, D_1, 0, 0]$ by the above method. The last two remaining bytes $[C_2, C_1]$ are the final CRC.
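Steps (1) to (3) can be sketched in Python as shown below. Two assumptions are made for brevity: the two zero bytes are appended before the loop starts rather than at the end, which gives the same result because they are not touched until the loop reaches them, and the per-byte CRC uses a compact shift-and-XOR form that is algebraically the same as the sixteen parallel equations of Section III. The function names are hypothetical.

```python
def crc16_byte(byte: int) -> int:
    """CRC of a single byte (compact form of the 8-bit parallel equations)."""
    x = byte ^ (byte >> 4)
    return ((x << 12) ^ (x << 5) ^ x) & 0xFFFF


def crc16_multibyte(data: bytes) -> int:
    """Fold the packet byte by byte, following steps (1) to (3)."""
    buf = list(data) + [0, 0]        # step (3): two zero bytes appended
    while len(buf) > 2:
        crc = crc16_byte(buf[0])     # CRC of the current first byte
        buf[1] ^= crc >> 8           # high 8 bits XORed into the next byte
        buf[2] ^= crc & 0xFF         # low 8 bits XORed into the byte after it
        del buf[0]                   # the first byte has been consumed
    return (buf[0] << 8) | buf[1]    # the last two bytes [C2, C1]


print(format(crc16_multibyte(bytes([0b01000001])), "016b"))  # 0101100011100101
print(hex(crc16_multibyte(b"123456789")))                    # 0x31c3
```

The loop runs once per data byte, so a 128-byte packet needs 128 applications of the parallel byte circuit, in line with the 128 clock cycles quoted in the abstract.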

Fig 5 illustrates the multi-byte CRC implementation.

Fig 5. Multibyte Implementation Of CRC Algorithm By Parallel Computation

B1 and B2 are the 1st and 2nd bytes of the input data. A1 and A2 are the newly appended bits. C1, C2, C3 and C4 represent the output of the 16-bit CRC. X1 and X2 are the outputs of the XOR operations.

In this way the CRC of multi-byte data can be obtained using fewer logic resources. Although appending the two zero bytes adds two operations, every operation is computed in parallel and its duration depends only on the propagation time of the flip-flops. The time consumed is far less than with the LFSR circuit, and the advantage becomes more pronounced as the number of bytes increases.

V. SIMULATION RESULTS

The following simulation result is for an 8-bit input. The CRC is "0101100011100101".

The following simulation result is for a 16-bit input. The final CRC is "1100111100011000".

The following simulation result is for a 128-byte input. The final CRC is "1111011101110100".

VI. CONCLUSION

This paper analyses the principle of CRC calculation and presents a general method for parallel CRC computation, together with a CRC algorithm for whole data packets. The examples given in this paper illustrate the parallel computation process in detail. Based on this method, the Xmodem protocol with CRC check was implemented on an FPGA.



