Project Curve Tracer - Framing Serial Data
Date Mon 28 March 2016
By Jason Jones
Category dispatch
Tags dispatch c curve tracer pic24 project uart serial python checksum crc fletcher16

Framed

In transferring data, there are many layers that one can add in order to ensure that the communication has certain properties. For instance, up to this point, we have implemented the physical layer and part of the the data layer. The physical layer specifies thevoltages we are permitted to operate on and the data layer ensures that each element transferred is a byte.

In this post, we are going to elaborate on the data layer and enable arbitrary transfers that contain a series of bytes or a block of data. What those bytes mean won't be won't matter at this level, we just need to ensure that the data that is received is actually the data that was intended.

Framing Elements

Framing is a method of sending a coherent block of data and verifying that the received data is exactly the intended data.

Payload

The first element of the frame is the payload. In our data model, the frame payload has a minimum size of 1 byte.

Payload Integrity Verification

Once the sender knows the contents of the frame payload, it can calculate a CRC or checksum of the data and append that verification code to the end of the data. This verification code allows the receiver to verify that the payload is unchanged between sender and receiver.

Start-of-Frame (SOF) Flag

The Start-of-Frame (SOF) flag is a specific value that indicates that the frame is about to begin. The SOF will not occur at any place within the frame (see 'Escape Sequences' below) and will not occur again until the next frame starts. The SOF itself is discarded by the receiver as it is not valid data in itself.

End-of-Frame (EOF) Flag

The End-of-Frame (EOF) flag is exactly analagous to the SOF flag except that it delimits the end of a frame. All data between the SOF and EOF are data to be consumed by the receiver while the SOF and EOF flags are discarded by the receiver.

Escape Sequences

One might imagine that the payload itself might contain a byte of data that happens to be the of the same value as the SOF or EOF. What are we to do when this occurs? This is where we escape the character and switch it out for another so that the entire sequence between SOF and EOF remains free of SOF and EOF characters. Thus, we find ourselves with a third special character called the ESC character. As the data is being framed, each byte is examined to ensure that it doesn't correspond to the SOF, EOF, or ESC characters. If it does, then the ESC character is inserted and the offending character is XORed with a value that changes its transmitted value. When the receiver encounters the ESC character, the ESC character is discarded and the very next character in the sequence is XORed with the same value, returning the string of characters to its original value.

The numbers that were chosen for the escape sequences are as follows:

#define SOF 0xf7
#define EOF 0x7f
#define ESC 0xf6
#define ESC_XOR 0x20

There is nothing special about these numbers. They are simply chosen based on observation of other similar implementations.

Example

Assume, for a moment, that we have a sequence of data that is to be transmitted.
The data is the payload, so we will designate it PL[0] through PL[4] for a 5-byte sequence that must be transmitted. The first step in framing is to simply calculate the verification sequence. In this case, we will be calculating a fletcher16 checksum and append the value to the end as CK[0] and CK[1]:

Frame, non-escaped

We append the checksum to the end of the data. Next, we must add any escaping that is necessary. In this sequence, lets say that PL[2] corresponds to the value of the SOF, EOF, or ESC characters. Note that the checksum is subject to escaping as well.

Frame, escaped

Once all of the SOF, EOF, and ESC characters have been removed by the escaping process, we can add our SOF and EOF secure in the knowledge that there is no data in between that will coincide with the SOF, EOF, or ESC that isn't supposed to be there.

Frame, complete

The receiver now re-executes this sequence in reverse. It first removes the SOF and EOF flags and discards them. The receiver then goes through the sequence identifying ESC characters and, where found, discards the ESC character, XORs the next character with the proper escape value, and saves that value as the data. Once the data has been un-framed and un-escaped, the receiver then calculates the checksum on the data and compares it with the checksum appended to the end of the data. If the checksum matches, then the data is accepted as a frame. If the checksum does not match, the entire data set is discarded.

Framing Bandwidth Efficiency

One thing to take note of. As the features of the network increase, so, too, does the amount of metadata required. For instance, in our simple framing example above, it took 10 characters to transmit a single 5 characters. This is an abysmal 50% transmission efficiency just to frame the characters! The advantage is that we can send messages that are far longer than 5 bytes. If we were to send a similar message of a length of 100 bytes with a 1.2% escape occurance, then we would end up adding 6 bytes for framing (2 checksum, 1 SOF, 1 EOF, and 1.2 rounded up to 2 for escaping). This would result in 94% efficiency. For this reason, it is critical for bandwidth that higher-level protocols transmit as many bytes in a single frame as possible in order to better leverage the available channels.

C Framing Implementation

Pre-requisites

In order to implement framing, we must have two byte-aligned circular buffers in the software layer just below our framing layer. The software layer must have available four functions:

function	description
UART_readable()	returns the number of full slots in the receive buffer
UART_writeable()	returns the number of empty slots in the send buffer
UART_read()	reads a number of bytes from the receive circular buffer
UART_write()	writes a number of bytes to the send circular buffer

The underlying UART library must take care of sending and receiving all bytes written to and received from the circular buffers. In our case, we are using the circular buffers written about in a previous article to store UART data in both directions.

Framing Functions

We are going to write three functions, FRM_push(), FRM_pull(), and FRM_fletcher16().

function	description
FRM_push()	"pushes" data to the UART for sending after framing
FRM_pull()	"pulls" data from the UART for receiving and de-frames
FRM_fletcher16()	calculates the Fletcher16 checksum of the data

Pushing Data

Data is 'pushed' out from higher level software using the FRM_push function:

void FRM_push(uint8_t* data, uint16_t length)

An array of data and the data length are supplied to the function. This data corresponds to our payload from the initial framing discussions. At this point, there is no SOF, EOF, checksum, or ESC applied to the payload.

The C code is as follows:

void FRM_push(uint8_t* data, uint16_t length){
    uint8_t txFrame[TX_FRAME_LENGTH];
    int16_t i = 0, txIndex = 0;

    uint16_t check = FRM_fletcher16(data, length);
    uint8_t check0 = (uint8_t)(check & 0x00ff);
    uint8_t check1 = (uint8_t)((check & 0xff00) >> 8);

    /* start the frame */
    txFrame[txIndex] = SOF;
    txIndex++;

    /* copy the data from data to the txFrame */
    while(i < length){
        /* add proper escape sequences */
        if((data[i] == SOF) || (data[i] == EOF) || (data[i] == ESC)){
            txFrame[txIndex] = ESC;
            txIndex++;
            txFrame[txIndex] = data[i] ^ ESC_XOR;
            txIndex++;
        }else{
            txFrame[txIndex] = data[i];
            txIndex++;
        }

        i++;
    }

    /* append the checksum ensuring that the values are properly escaped */
    if((check0 == SOF) || (check0 == EOF) || (check0 == ESC)){
        txFrame[txIndex] = ESC;
        txIndex++;
        txFrame[txIndex] = check0 ^ ESC_XOR;
        txIndex++;
    }else{
        txFrame[txIndex] = check0;
        txIndex++;
    }

    if((check1 == SOF) || (check1 == EOF) || (check1 == ESC)){
        txFrame[txIndex] = ESC;
        txIndex++;
        txFrame[txIndex] = check1 ^ ESC_XOR;
        txIndex++;
    }else{
        txFrame[txIndex] = check1;
        txIndex++;
    }

    /* end the frame */
    txFrame[txIndex] = EOF;
    txIndex++;

    /* wait for the UART circular buffer to clear from
     * any currently in-progress transmissions */
    while(UART_writeable() < txIndex);
    UART_write(txFrame, txIndex);
}

The comments explain reasonably well what is going on, but lets take a few lines at a time anyway

uint8_t txFrame[TX_FRAME_LENGTH];
int16_t i = 0, txIndex = 0;

These lines simply declare some useful variables. The txFrame[] array is the place in which we will build the frame to be transmitted. The txIndex variable will act as an index for txFrame as the frame is being constructed. The i variable will be used throughout the function for looping and is declared here for convenience.

uint16_t check = FRM_fletcher16(data, length);
uint8_t check0 = (uint8_t)(check & 0x00ff);
uint8_t check1 = (uint8_t)((check & 0xff00) >> 8);

First, we calculate the fletcher checksum and store the 16-bit results into two 8-bit registers, check0 and check1. This calculation is done on the raw data before any data is stored on the frame.

txFrame[txIndex] = SOF;
txIndex++;

Next, we start the frame. Based on previous discussions, this should be clear, but note that we increment the txIndex. Each write to the txFrame[] must be accompanied by an increment of the txIndex so that data doesn't get overwritten.

while(i < length){
    /* add proper escape sequences */
    if((data[i] == SOF) || (data[i] == EOF) || (data[i] == ESC)){
        txFrame[txIndex] = ESC;
        txIndex++;
        txFrame[txIndex] = data[i] ^ ESC_XOR;
        txIndex++;
    }else{
        txFrame[txIndex] = data[i];
        txIndex++;
    }

    i++;
}

We now copy the majority of the data into the txFrame[] array. Note that we have an if condition that ensures that SOF, EOF, and ESC characters are identified and properly escaped.

if((check0 == SOF) || (check0 == EOF) || (check0 == ESC)){
    txFrame[txIndex] = ESC;
    txIndex++;
    txFrame[txIndex] = check0 ^ ESC_XOR;
    txIndex++;
}else{
    txFrame[txIndex] = check0;
    txIndex++;
}

if((check1 == SOF) || (check1 == EOF) || (check1 == ESC)){
    txFrame[txIndex] = ESC;
    txIndex++;
    txFrame[txIndex] = check1 ^ ESC_XOR;
    txIndex++;
}else{
    txFrame[txIndex] = check1;
    txIndex++;
}

Nearing the end now, each checkx byte that was saved previously is also checked to ensure that it doesn't coincide with the SOF, EOF, or ESC characters before saving on the txFrame[] array.

txFrame[txIndex] = EOF;
txIndex++;

Of course, the final step to constructing the frame is saving the EOF character. At this point, the frame is completely constructed and ready to be written to the UART using UART_write(). It is prudent to ensure that the number of bytes to be written will actually fit in the transmit buffer, so we place a check and wait statement. The software waits for UART_writeable() to return a value less than the length of the frame before writing to it.

while(UART_writeable() < txIndex);
UART_write(txFrame, txIndex);

Note that this method would still allow multiple frames to stack up in the outbound circular buffer, as long as those frames were relatively short. This effectively stalls the processor and keeps it from losing data. Care must be taking to write a reaonable amount of data through this interface or the processor will bottleneck at this interface.

I will leave FRM_pull() as an exercise for the reader. One solution can be found in the project github repository.

Python Framing Implementation

For the Python code, we will take a closer look at the de-framing mechanism. I should include the caveat that all code is subject to change, depending on what sorts of bugs are found. The presented code is what is working today.

First off, a class called Frame is created, which creates a thread called run. This thread will be run continually in order to extract frames into a list of lists rx_messages, which is part of the Frame object.

def run(self):
    while True:
        for element in self.port.read(1000):
            self.raw.append(element)

        # pull out the frames
        while (self.SOF in self.raw) and (self.EOF in self.raw):

            # find the SOF by removing bytes in front of it
            while self.raw[0] != self.SOF:
                self.raw.pop(0)

            # find the EOF
            for i, element in enumerate(self.raw):
                if element == self.EOF:
                    end_char_index = i
                    break
            frame = self.raw[:end_char_index]
            self.raw = self.raw[end_char_index:]

            # remove the SOF and EOF from the frame
            frame.pop(0)

            message = []
            escape_flag = False
            for element in frame:
                if escape_flag is False:
                    if element == self.ESC:
                        escape_flag = True
                    else:
                        message.append(element)
                else:
                    message.append(element ^ self.ESC_XOR)
                    escape_flag = False

            # remove the fletcher16 checksum
            f16_check = message.pop(-1) * 256
            f16_check += message.pop(-1)

            # calculate the checksum
            if self.fletcher16_checksum(message) == f16_check:
                self.rx_messages.append(message)

        time.sleep(0.1)

Again, a few lines at a time...

    while True:
        for element in self.port.read(1000):
            self.raw.append(element)

The code will be executed for as long as the Frame object is in memory. Also, the code will continually read up to 1000 bytes from the serial port self.port and append to self.raw in preparation for processing.

        while (self.SOF in self.raw) and (self.EOF in self.raw):

When self.raw contains at least one SOF and one EOF chracter, process the detected frame.

            while self.raw[0] != self.SOF:
                self.raw.pop(0)

These lines remove any residual data from in front of self.raw up to the first detected SOF.

            for i, element in enumerate(self.raw):
                if element == self.EOF:
                    end_char_index = i
                    break
            frame = self.raw[:end_char_index]
            self.raw = self.raw[end_char_index:]

The frame list is created from the values between the SOF and EOF of self.raw. This swath is then removed from self.raw so that it doesn't get re-processed later. From this point on, the data of concern is within frame.

            frame.pop(0)

            message = []
            escape_flag = False
            for element in frame:
                if escape_flag is False:
                    if element == self.ESC:
                        escape_flag = True
                    else:
                        message.append(element)
                else:
                    message.append(element ^ self.ESC_XOR)
                    escape_flag = False

The message list will store the final de-framed data. Note that any escaped variables are caught before appending data to message. This is a stateful execution, meaning that the state of this code is either 'escaped' or 'not escaped' using escape_flag.

            # remove the fletcher16 checksum
            f16_check = message.pop(-1) * 256
            f16_check += message.pop(-1)

            # calculate the checksum
            if self.fletcher16_checksum(message) == f16_check:
                self.rx_messages.append(message)

The last two bytes of the received data contain the Fletcher16 value. This value is first removed from the message and integrated back to f16_check. If the calculated checksum value is the same as f16_check, then the complete processed data is placed into self.rx_messages for later processing. If the check fails, the data is discarded.

Testing the Framing Implementations

For a test of our implementation, we are sending a short frame from the micrcontroller to the PC using the presented code.

uint8_t txData[] = {0,0xf7,0,0x7f,0,0xf6,6,7};
uint16_t txLength = 8;

FRM_push(txData, txLength);

Note that we chose values in txData so that the escape sequences would be necessary in order to transmit the data. We displayed the data on the console using the print statement: