Hello!
This article discusses floating-point numbers and, in particular, the implementation of the specialized FP23 format on field-programmable gate arrays (FPGAs). While working on a specific project, I had the idea of implementing a floating-point data format tailored to certain needs. The idea grew into a real project, which later found application in several interesting digital signal processing problems. The article covers the main difficulties of implementing a floating-point data format on Xilinx FPGAs and describes the basic mathematical operations in the FP23 format. At the end of the article you will find the source code of the project, which you can freely use in your own tasks or as a basis for implementing similar data formats.

Data format
Floating-point numbers are widely used in modern signal processors, video cards and even FPGAs. In accordance with the IEEE 754 standard, they are represented as a set of an exponent, a mantissa and a sign. Single-precision numbers in the IEEE 754 standard occupy 32 bits (float). For the sake of unambiguity, float numbers are written in normalized form, or are reduced to normalized form after mathematical operations and conversions. The range of possible values depends on the number of bits in the mantissa and the exponent. Float has a fixed relative accuracy and a variable absolute accuracy. The main advantages and disadvantages of numbers in float format:
- + A very wide range of possible values is provided;
- + High accuracy is achieved;
- - Very large numbers are rounded to the nearest value on the grid of representable values;
- - Accuracy is lost when summing numbers whose magnitudes differ by many orders;
- - The complexity of implementation and application on the FPGA.
For modern FPGA chips, implementing floating-point numbers is no longer a serious obstacle. The leading manufacturers, Xilinx and Altera, offer ready-made soft IP cores that implement float according to the standard. The latest Altera families also feature hard IP cores that implement the IEEE 754 format in their DSP blocks.
Digital part
Advanced FPGA families serve as the main hardware base for mathematical operations and digital signal processing tasks. The work described here is carried out on Xilinx FPGAs, which contain columns of configurable blocks. First of all, these are the universal digital processing and arithmetic blocks: DSP48E1 (in Virtex-6 and 7 series FPGAs) and DSP48E2 (in UltraScale FPGAs). These blocks perform addition and subtraction, shift and carry, high-speed multiplication, multiply-accumulate (MACC), checksum calculation and so on. DSP48 cells in modern FPGAs are used for many tasks: building CIC and FIR filters, accelerating FFT / IFFT, implementing digital frequency synthesizers (DDS and CORDIC), and creating various data representation formats together with a set of operations for them.
The second basic component of the FPGA is block memory. In Xilinx FPGAs these are RAMB36E1 (or RAMB18E1) cells: dual-port memory elements arranged inside the die as columns of identical blocks. The number of memory columns depends on the size of the die. The maximum size of a memory block is 36 Kbit, and it can be configured as two independent blocks of 18 Kbit each. The memory can be used with data widths from 1 to 72 bits. If no free block memory remains, you can use distributed memory in the special SLICEM cells. For UltraScale FPGAs, the amount of such memory is up to 512x1, with the ability to reconfigure the depth and width of the bus.
Formulation of the problem
Taking into account the listed features of the analog and digital parts, the task is formulated as follows. Let the data arriving at the FPGA from the ADC chips be 16 bits wide. The smallest Kintex-7 chip is chosen as the FPGA. The task is to convert a 16-bit signed integer (FIX16) into the specialized floating-point format FP23 and back, while ensuring maximum processing speed inside the FPGA with minimal use of chip resources. In addition, a set of basic mathematical operations must be defined and implemented: addition, subtraction and multiplication.
FP format implementation
FP23 is a special floating point representation format. Unlike the IEEE 754 standard, the word length in the FP23 format is not 32 bits, but only 23 bits. The following formula in a simplified form reflects the representation of the number "A" in FP format:

The mantissa in FP23 format is a 17-bit positive number. After every transformation it is kept in normalized form, i.e. the most significant bit of the mantissa is always “1”. To conserve chip resources and simplify the notation, the 17th bit of the mantissa is never stored in the packed 23-bit word (see figure). Therefore, when performing mathematical operations and converting from one format to another, remember that the mantissa is normalized: for a non-zero number, 2^16 must be added to the stored 16-bit value. The exponent in FP23 format is a 6-bit positive number that determines the power of two by which the mantissa is multiplied. The remaining bit is allocated to the sign: a value of zero corresponds to positive numbers, and a value of one to negative numbers. Thus, the number A in FP23 format is described by three bit fields: the exponent EXP(A), the sign SIGN(A) and the mantissa MAN(A).
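The three bit fields can be illustrated with a small software model. This is a minimal sketch in Python, not the article's VHDL: the function names are hypothetical, the field order {EXP, SIGN, MAN} inside the 23-bit word is assumed from the output description given later in the article, and exponent 0 is treated as the encoding of zero, as the multiplication section describes.

```python
EXP_BITS = 6    # exponent width, from the article
MAN_BITS = 16   # stored mantissa width (hidden bit excluded)

def fp23_pack(exp, sign, man):
    """Pack the fields into a 23-bit word (assumed order: EXP, SIGN, MAN)."""
    if not (0 <= exp < (1 << EXP_BITS)):
        raise ValueError("exponent out of range")
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if not (0 <= man < (1 << MAN_BITS)):
        raise ValueError("mantissa out of range")
    return (exp << (MAN_BITS + 1)) | (sign << MAN_BITS) | man

def fp23_unpack(word):
    """Split a 23-bit word back into (exp, sign, man)."""
    exp = (word >> (MAN_BITS + 1)) & ((1 << EXP_BITS) - 1)
    sign = (word >> MAN_BITS) & 1
    man = word & ((1 << MAN_BITS) - 1)
    return exp, sign, man

def full_mantissa(exp, man):
    """Restore the 17-bit mantissa: hidden bit is 1 for non-zero numbers."""
    return 0 if exp == 0 else (1 << MAN_BITS) | man
```

For example, `fp23_unpack(fp23_pack(17, 0, 0x8000))` returns the same three fields, and `full_mantissa(17, 0x8000)` gives the 17-bit value 0x18000.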

Table 1 shows the data width, the range of possible values of mantissa and exponent, etc.

So, the first stage in developing a floating-point data format on the FPGA is to convert a signed integer of fixed width into the FP23 floating-point format while ensuring maximum processing speed inside the FPGA. The second stage is the inverse conversion from the FP23 format to the fixed-point format, with the option of scaling the output data by a given division factor. Finally, at the third stage, to make full use of the FP23 format, the mathematical operations must be implemented: addition, subtraction and multiplication.
Before describing the FP23 modules, a legitimate question must be answered: “Why is the data width exactly 23 bits?” The answer has several components. First, the 16-bit input has one sign bit and 15 bits of magnitude, so the mantissa and the sign together need at least 16 bits. The exponent, with the offset from the formula, needs a minimum of 5 bits. The more bits are allocated to the exponent, the wider the range of representable values. Let the exponent have 6 bits, the mantissa 16 bits, and the sign 1 bit. Then 23 bits are used to represent numbers in the FP23 format. If the hidden bit of the mantissa is also stored (1 for non-zero values and 0 for zero), the word width becomes 24 bits.
In addition, the value "23" was not chosen at random: storing vectors of complex values will later require FPGA memory. A pair of values has a total width of 46 bits without the hidden bit and 48 bits with it, which is exactly the data width of three RAMB18E1 blocks. Also, for a pair of values you can use additional functions of the DSP48E1 block: a fast comparison of two numbers, a parity check, and fast summation in SIMD mode.
FIX16-to-FP23 conversion
To convert data from the signed integer type FIX16 to the FP23 format, the FPGA logic and the built-in DSP48E1 multipliers are used. The figure shows a block diagram of the conversion to FP23 format.

The maximum operand widths of the DSP48E1 multiplier are 25 and 18 bits, where the most significant bit of each operand is the sign. When unsigned numbers are multiplied in the DSP48E1, the effective width of the input data is therefore reduced by 1 bit. RAMB18E1 block memory is not used by the converter.
To find the sign of the number, it is enough to take the most significant (sign) bit of the input data. The algorithm for finding the mantissa MAN(A) using FPGA primitives is as follows:
- take the absolute value of the input number (XOR operation),
- find the most significant set bit (MSB SEEKER),
- from the position of the most significant bit, form the shift value for the mantissa,
- multiply the absolute value by the shift value, which yields the normalized mantissa (FRAC SHIFTER).
To find the exponent EXP(A) you need to:
- take the absolute value of the input number (XOR operation),
- find the most significant set bit (MSB SEEKER),
- subtract the inverted index of the most significant bit from the number “32”,
- add 1 to the result, taking into account the logic that detects a non-zero input value (LUT ZERO FORMER).
The mantissa is obtained by multiplying the absolute value of the input number by a “mask” that determines the shift of the input number. Both factors are unsigned, and a DSP48E1 block is used for the multiplication. The search for and formation of the mantissa take place in the FRAC SHIFTER node, shown in the figure.
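The two step lists above can be modelled in software. The following Python sketch is only an illustration under stated assumptions, not the article's VHDL: the function name and the constant `BIAS = 17` are hypothetical (the offset is chosen so that the integer 1 encodes with exponent 17, which agrees with the exponent correction described in the multiplication section), the XOR-based magnitude is modelled as a true absolute value, and the shift is done with `<<` instead of a multiplication by the factor 2^(16-MSB).

```python
BIAS = 17  # assumed exponent offset: the integer 1 is encoded with EXP = 17

def fix16_to_fp23(x):
    """Convert a 16-bit signed integer to FP23 fields (exp, sign, man)."""
    if not (-32768 <= x <= 32767):
        raise ValueError("input is not a 16-bit signed integer")
    sign = 1 if x < 0 else 0
    mag = abs(x)                    # magnitude (the hardware uses XOR logic)
    if mag == 0:
        return 0, 0, 0              # zero is encoded with a zero exponent
    msb = mag.bit_length() - 1      # MSB SEEKER: index of the highest set bit
    full = mag << (16 - msb)        # FRAC SHIFTER: multiply by 2**(16 - msb)
    man = full & 0xFFFF             # drop the hidden bit (bit 16)
    exp = BIAS + msb
    return exp, sign, man
```

With these assumptions, `fix16_to_fp23(3)` yields `(18, 0, 0x8000)`, i.e. 1.5 x 2^1 with the hidden bit removed from the mantissa.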

As can be seen, the mantissa is formed in two stages. In the first stage, the factor 2^(16-MSB) is created from the input number using LUT logic resources. In the second stage, the input data is multiplied by this shift factor in the DSP48E1 node, and the normalized mantissa appears at the output of the multiplier. For the multiplication of unsigned numbers, the maximum width of one of the DSP48E1 factors is 17 bits, so neither the mantissa nor the generated shift factor can be wider. This restriction is another reason why the mantissa in the FP23 format is 17 bits wide. The most significant bit of the mantissa is 1 for non-zero numbers; it is reserved and used when the resulting mantissa is computed, but it is not included in the output word in normalized form. This bit can be output and included in the word, in which case the output word is 24 bits wide.
To reduce the chip resources used, the DSP48E1 block can be eliminated. The mantissa search node then turns into a barrel shifter. On some FPGAs the number of DSP48E1 blocks is small and wasting them is critical, so the FP23 blocks allow either option to be chosen. In addition, with the barrel shifter the total latency of the conversion is reduced by one clock cycle.
The adders used to compute the exponent are built as full binary adders from the MUXCY and XORCY components and the carry chain, which are part of the basic SLICEL and SLICEM cells. A detailed description can be found in any literature on programmable logic.
Sample timing diagram for converting data from FIX16 to FP23:

DIN is the 16-bit input data. DOUT is a number in the FP23 format, divided into three fields: {EXP, SIGN, MANT}.
FP23-to-FIX16 conversion
Converting data from FP23 format to FIX16 is done in four steps.

The algorithm for obtaining the signed binary number A consists of several sequential actions:
- subtract the value SCALE (output scaling) from the exponent EXP(A),
- create the shift mask 2^(EXP(A) - SCALE),
- add the hidden bit IMPL(A) to the mantissa and multiply the result by the shift mask,
- take the sign SIGN(A) into account by applying the "exclusive OR" operation.
As in the forward conversion node to the FP23 format, the DSP48E1 block in this scheme can be replaced with a barrel shifter built on SLICEM cells to save FPGA resources.
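The inverse conversion can be sketched the same way. This Python model is an illustration only and rests on several assumptions that the article does not state: the function name is hypothetical, `scale = 17` is assumed to give an unscaled round trip (each extra unit of `scale` halves the output), the integer part is taken as the bits above the 16 fractional bits of the mantissa, negative shifts are handled as right shifts, the sign is applied by negation rather than XOR logic, and the result is saturated to the 16-bit range.

```python
def fp23_to_fix16(exp, sign, man, scale=17):
    """Convert FP23 fields to a 16-bit signed integer (software model)."""
    if exp == 0:
        return 0                          # zero exponent encodes zero
    full = (1 << 16) | man                # restore the hidden bit
    shift = exp - scale                   # EXP(A) - SCALE
    # Multiply by the shift mask 2**shift (a right shift if it is negative).
    prod = full << shift if shift >= 0 else full >> -shift
    mag = prod >> 16                      # keep the integer part
    value = -mag if sign else mag         # apply SIGN(A)
    return max(-32768, min(32767, value)) # saturation is an added assumption
```

For instance, `fp23_to_fix16(18, 0, 0x8000)` returns 3, and the same fields with `scale=18` return 1, i.e. 3 divided by 2 and truncated.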
Multiplication
Multiplication of numbers in the FP23 format is one of the simplest and most straightforward operations. The hardware multiplication algorithm is implemented as follows (see figure):
- multiply the mantissas in the DSP48E1 node,
- normalize the mantissa (take bits [32 ... 17] or [31 ... 16] of the product as the mantissa, depending on the value of the most significant bit),
- add the exponents,
- subtract the number 16 from the sum of the exponents,
- if the 33rd bit of the product is “0”, subtract 1 more from the resulting exponent,
- determine the sign of the product using the XOR operation.

The scheme does not show the logic of multiplying by zero, but algorithmically it boils down to this: if the exponent of either input number is zero, the output result is also zero. On the FPGA, zero detection is built from the logical functions AND and OR on LUT blocks.
The number 16 is subtracted from the sum of the exponents to account for the hidden bits in the mantissas of the input numbers, which increase the exponents. The adders and subtractors in the floating-point multiplier node are also implemented as binary full adders.
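The multiplication steps listed above map directly onto integer arithmetic. Below is a minimal Python sketch of those steps; the function name and the tuple layout `(exp, sign, man)` are hypothetical, bits are numbered from 0, and exponent overflow or underflow is deliberately left unhandled because the article does not describe it.

```python
def fp23_mult(a, b):
    """Multiply two FP23 numbers given as (exp, sign, man) tuples."""
    ea, sa, ma = a
    eb, sb, mb = b
    if ea == 0 or eb == 0:
        return 0, 0, 0                      # a zero exponent means zero
    # 17 x 17 bit product of the mantissas with hidden bits restored
    prod = ((1 << 16) | ma) * ((1 << 16) | mb)
    top = (prod >> 33) & 1                  # most significant bit of the product
    if top:
        man = (prod >> 17) & 0xFFFF         # bits [32 ... 17]
    else:
        man = (prod >> 16) & 0xFFFF         # bits [31 ... 16]
    # Sum of exponents minus 16, and minus 1 more when bit 33 is 0
    exp = ea + eb - 16 - (0 if top else 1)
    sign = sa ^ sb                          # sign of the product
    return exp, sign, man
```

As a check of the exponent rule: multiplying `(18, 0, 0x8000)` by itself gives `(20, 0, 0x2000)`. If exponent 17 is read as 2^0 (an inference from these steps, not a value quoted from the article), that is 1.5 x 2^1 squared, i.e. 1.125 x 2^3 = 9.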
Addition and subtraction
Addition and subtraction of numbers in the FP23 format are the most resource-intensive operations (see figure). Adding numbers in the FP23 format consists of the following basic stages:
- bringing the operands to a common exponent (alignment),
- adding the mantissas,
- normalizing the result (choosing the exponent so that bit 16 of the mantissa equals "1").

The hardware addition algorithm is implemented as follows:
- the operands “A” and “B” are compared in absolute value; if |A| < |B|, the numbers are swapped,
- the difference between the exponents of the operands is calculated; it determines by how many bits the mantissa of “B” must be shifted to the right to bring the numbers to a common exponent,
- from the exponent difference, the factor by which the mantissa of "B" is multiplied is generated,
- the result of the multiplication is added to the mantissa of “A”,
- the position of the most significant bit (MSB) of the resulting number is determined, and a normalization factor is formed from it,
- the mantissa is normalized by multiplying the sum of the mantissas by this factor,
- the exponent of the result is EXP(C) = EXP(A) - MSB + 1.
Subtraction is implemented similarly to addition. The only difference is that the sign SIGN(B) of the subtracted operand is inverted. The DSP48E1 blocks can also be replaced with a barrel shifter.
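The addition algorithm can also be modelled with integers. The Python sketch below is a rough illustration under assumptions the article does not spell out: the function name and tuple layout are hypothetical, "MSB" in the exponent formula is interpreted as the number of leading zeros of the 18-bit mantissa sum, the alignment and normalization multiplications are replaced by shifts, bits shifted out of "B" are simply truncated, and exponent overflow is not handled.

```python
def fp23_add(a, b, subtract=False):
    """Add (or subtract) two FP23 numbers given as (exp, sign, man) tuples."""
    ea, sa, ma = a
    eb, sb, mb = b
    if subtract:
        sb ^= 1                               # subtraction inverts SIGN(B)
    if eb == 0:
        return (ea, sa, ma) if ea else (0, 0, 0)
    if ea == 0:
        return eb, sb, mb
    # Swap so that |A| >= |B|; (exp, man) orders normalized magnitudes
    if (ea, ma) < (eb, mb):
        ea, sa, ma, eb, sb, mb = eb, sb, mb, ea, sa, ma
    full_a = (1 << 16) | ma
    full_b = ((1 << 16) | mb) >> (ea - eb)    # align B to the exponent of A
    total = full_a + full_b if sa == sb else full_a - full_b
    if total == 0:
        return 0, 0, 0
    lz = 18 - total.bit_length()              # "MSB": leading zeros in 18 bits
    exp = ea - lz + 1                         # EXP(C) = EXP(A) - MSB + 1
    if exp <= 0:
        return 0, 0, 0                        # flush to zero (assumption)
    man = ((total << lz) >> 1) & 0xFFFF       # normalize, drop the hidden bit
    return exp, sa, man
```

With exponent 17 read as 2^0 (an assumption), `fp23_add((18, 0, 0x8000), (17, 0, 0))` models 3 + 1 and returns `(19, 0, 0)`, while the same call with `subtract=True` returns `(18, 0, 0)`.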
Resources
The results of the synthesis and the propagation delay for the operation for each node are summarized in Table 2. The table shows that the simplest and fastest operation is to multiply two numbers in the FP23 format.

The volume of resources occupied after synthesis for all nodes is given in a common log file. An example of the results of the synthesis for the multiplier fp23:
Top Level Output File Name : fp23_mult_m1.ngc
Primitive and Black Box Usage:
------------------------------
# BELS                     : 106
#      GND                 : 15
#      INV                 : 1
#      LUT2                : 25
#      LUT3                : 16
#      LUT4                : 6
#      LUT6                : 14
#      MUXCY               : 14
#      VCC                 : 1
#      XORCY               : 14
# FlipFlops/Latches        : 75
# Shift Registers          : 8
#      SRLC16E             : 8
# DSPs                     : 1
#      DSP48E1             : 1
Source
All FP23 format nodes are written in VHDL. For convenience and testing, a small C++ program was also written that contains software versions of the floating-point operations implemented on the FPGA. With its help you can debug and implement other floating-point formats with different widths of the exponent and the mantissa.
To make the VHDL code easier to read, a type is defined that describes a number in the floating-point format. All components and types of FP23 are collected in the file fp_m1_pkg.vhd:
    type fp23_data is record
        exp : std_logic_vector(5 downto 0);
        sig : std_logic;
        man : std_logic_vector(15 downto 0);
    end record;
The source code of the RTL descriptions, the test program and the synthesis results can be found on GitHub.
Test case
To test all the developed nodes, I designed a small circuit: a complex multiplier whose input and output data are integers and whose intermediate results are in floating point. That is, the input data is converted to FP23 format, the complex multiplication is performed in this format, and the result is converted back from FP23 to fixed point. The block diagram in the RTL Viewer is as follows:

The placement of the complex multiplier in PlanAhead is shown in the following figure. The FPGA is a Kintex-7, XC7K70TFBG484-1C.

After full place and route of the project, a processing frequency of about 300 MHz is reached. In practice, this figure is ~30% higher.
In the FPGA Editor, the regular multiplier block in the FP23 format looks like this:

Conclusion
As a result of this work, a specialized floating-point data format, FP23, was developed for FPGAs. It differs from the traditional IEEE 754 format and is tailored for processing on the FPGA at maximum speed. Unlike the standard solutions from Xilinx and Altera, nodes in the FP23 format occupy significantly fewer chip resources. The addition and multiplication operations in FP23 have been used to implement FIR filters and an FFT / IFFT node on FPGAs.
If you wish, you can create your own data format "FP_X" for specific tasks with an arbitrary width of the mantissa and exponent. In that case, the features and structure of the specific FPGA must be taken into account.
Thanks for your attention! To be continued...