
SimBank, or 200 smart cards in one box

As I wrote in my first post on Habr about designing a printed circuit board, my company develops telephony devices of various kinds: VoIP, GSM and PBX gateways, GSM-controlled rebooter sockets and so on. Today I will describe the development of firmware for the PCI version of the SimBank device, where I acted as the FPGA designer.

The task as set by management, plus the history of the device


The SimBank device is designed to work with SIM cards (Subscriber Identification Module), which are a variety of ISO-7816 cards. It can be used for centralized storage of SIM cards or other types of smart cards and their subsequent use in GSM gateways, satellite TV tuners, or other devices that rely on a smart card during operation. Together with the Sim-server software, it provides a flexible system for monitoring and accounting for the SIM cards used in your applications, and ample options for configuring the system through a convenient web interface. The cards themselves are stored in a place accessible to you and are connected to the terminal devices over TCP/IP.


There are articles on Habr describing how smart cards are built and how to work with them:
A quick introduction to SIM cards, by OgreSwamp
Smart cards for the smallest, by brake
How a smart card works, by rlepricon
I enjoyed reading all of them before starting development, and I will not repeat what is written there.


SimBank device

The device itself was developed at the company long ago and has been used successfully for several years. But a peculiarity of all PCI devices at the company was the tutorial PCI core, “supplied as is, with no complaints accepted”. None were filed. With minimal rework this core was installed on all cards without exception, with the same Vendor ID, Device ID and Class Code, which allowed skillful use of the available address space. Since the devices are usually assembled and tested at the company, there were no conflicts. Under tight deadlines the principle “if it works, don't touch it” is quite appropriate. This went on until some modern motherboards became unstable with the board, or the board refused to start at all.

It was time to deal with the PCI core more thoroughly. The tutorial core, reworked for the company's tasks, was documented with comments, but they were not always enough. The link to where it had been taken from, dated 2002 or 2004, no longer worked.
The task was as clear as day: “The device must work on all standard boards”.
As a bonus: “Even if that means not all 200 SIM cards, but 196 or 1xx”.
Up to this point I had never worked with SIM cards or with the PCI bus as an FPGA developer, only on the schematic and board side. I already had development experience in VHDL and Verilog.
It was decided to start the investigation with PCI, since the part relating to SIM cards worked.
The first reading of the “PCI Local Bus” specification worked no miracles and did not shed much light on how everything should be done. Having read articles on how to do it and how not to, I still understood nothing and decided: “It has to be done, I'll figure it out as I go.”
For PCI, Altera's Quartus environment has a megafunction that generates a PCI core with the parameters we need. Good documentation comes with the core, and the site has a description of a PCI kit that includes a PCI model for behavioral simulation in ModelSim-Altera. Together, this helps a lot when starting from scratch in this area.
Altera also allows you to use its Vendor ID (1172h).
From the PCI specification I selected a device class that is absent from earlier revisions: base class 07h (Simple Communication Controllers), subclass 05h (Smart Card).
After that, everything has to work under Linux. Since a colleague has more experience with both SIM cards and PCI, the technical requirements came from him, for which many thanks. A well-written document is half the battle, although it does require “extra scribbling”.

Study material


Since the device was not developed by me, I also had to get acquainted with the schematic and the board design.
The schematic is needed to understand what comes from where and what it connects to, and the board shows where to plug in the programmer and the SIM cards and where each service indicator is located. How this played a cruel joke on me, I will describe later.



According to the schematic, we have a board with two Altera Cyclone II EP2C35F672C8 FPGAs in 672-pin packages. One FPGA is connected to the PCI bus and controls the second FPGA. 100 SIM card holders are attached to each FPGA. Each SIM card has its own reset and data signals; the CLK signal is shared by a group of 10 cards.



Each EP2C35 chip contains 105 M4K internal memory blocks. Each block holds 4,608 bits in total, including the parity bits.

The blocks support different configurations from 4K × 1 to 128 × 32 (36).
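The figures above can be checked with a little arithmetic. This is only a sketch: the block count, the bits per block and the listed configurations come from the text, while the variable and function names are made up for illustration. It shows that the 36-bit-wide configuration uses all 4,608 bits, and that one block gives 512 bytes of plain data storage.

```python
M4K_BLOCKS = 105        # blocks per EP2C35, from the text
BITS_PER_BLOCK = 4608   # per block, parity bits included

# Configurations named in the text as (depth, width).
CONFIGS = {
    "4K x 1": (4096, 1),
    "128 x 32": (128, 32),
    "128 x 36": (128, 36),
}


def block_bits(depth, width):
    """Bits used by one block in the given configuration."""
    return depth * width


total_bits = M4K_BLOCKS * BITS_PER_BLOCK
data_bytes_per_block = block_bits(*CONFIGS["4K x 1"]) // 8

if __name__ == "__main__":
    for name, (depth, width) in CONFIGS.items():
        print(name, block_bits(depth, width), "bits")
    print("total on-chip RAM bits:", total_bits)
    print("data bytes per block:", data_bytes_per_block)
```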

After that I wrote a document with additional questions: what is fed to what, who monitors the duration of particular signals, who checks the correctness of the data from the SIM, and so on.

Having sorted out the PCI bus pins of the first chip, I made a pilot project in Quartus containing only the PCI core, which was supposed to come up on the bus with the new parameters. No logic at all, just the core.
When the “lspci -vv” command returned an answer, my happiness knew no bounds.



Having discussed the further course of work with the driver and application developer, we made a pilot project in which the driver wrote data into memory, then read it back and compared. Initially, verification was done only on the first chip in the circuit.
Next came sorting out the second chip.
Between the two chips there are 36 signal lines on the schematic, plus 2 lines between the PLL_OUT and CLK pins. The previous project used 19 address lines, two bytes for data exchange (one in each direction), read signals and a write signal. This followed from the way the existing core worked and from the wish to keep the driver and programs universal. For example, there is a fixed range of addresses from which service information can be read.
Information of the following kind is stored there:



Elgato G4_1 - device type (for PCI cards “K16” or “K32”, for Simbank - “SimBank”)
SIM51215 (SIM900 or other) - the type of the installed GSM or 3G module (required for selecting a set of AT commands). For the device described here, it determines which SIM card we can register on the operator's network through this module.
2014 - year of development
ver.14.144 - firmware version number
SN: 0123

And if you did not forget to change it to the correct one before flashing, this data lets you find out a lot of interesting things from the “ledger” that records for whom the project was made and which deviations from the standard the customer wanted.
For SimBank the serial number is enough, since by design there is always one SimBank board per device. And in the PCI version there is only one such board in the company.

Painful decision


The previous version of the core used two BARs: BAR1 as a 1M memory mapping and BAR0 as 4K of I/O registers. The entire address space was presented to both FPGAs simultaneously.
It was decided to abandon I/O space for several reasons. First, memory is much easier for the driver developer to work with. Second, there are many recommendations online to avoid I/O in new designs where possible. Third, the amount of I/O space that can be configured in the Altera core is limited to 256, while far more memory space is available. But this transition changed the way the SIM card modules are accessed and meant an almost complete rework of the SIM exchange module. Moving away from the usual addressing also means that none of our existing programs can read the bits we need at their usual place.
We could have managed with one base address instead of two and not saved one clock cycle when talking to the cards of the first chip. This part of the data exchange is not involved in voice transmission, so we do not need to save milliseconds to keep the delay under 20 ms. But this seemed right to me.

Structure


After the first successes and the verified writing of test data to memory, it was time to build a structure for exchanging data with the SIM cards. I had long had a rough idea of how to do it, but only after the tests was I ready to assume it would work. So the tests had been thought out and done not in vain.
The existing 36 + 2 signal lines were not enough to pass the PCI bus control signals to the second chip unchanged, and I did not want to limit the transfer to one or two bytes, even if that would have allowed keeping the address and data buses separate.
Therefore 32 of the 36 signal lines were given straight to the data (and address) bus, 1 line controlled the general system reset, and all the control was hung on the remaining three. A separate line was available for the CLK signal.
Idle mode, address transfer mode, single data write mode, burst data write mode, masked write modes, read and burst read: eight bus states in total, which fit into three signal lines.
It only remained to pass the four BE (byte enable) signals for masked writes. It was decided to hide them in the upper four bits of the data bus when this mode is used.
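The inter-chip link described above can be sketched as a tiny encoder and decoder. Everything concrete here is an assumption made for illustration: the article lists the modes but not their 3-bit codes, splitting the masked write into a single and a burst variant is only one way to arrive at eight states, and the names `BusMode`, `pack_word` and `unpack_word` are hypothetical. The sketch only shows that eight modes fit in three control lines and that four byte enables can travel in bits 31..28 of the 32-bit bus.

```python
from enum import IntEnum


class BusMode(IntEnum):
    # Hypothetical 3-bit codes; the real assignment is not given in the article.
    IDLE = 0
    ADDRESS = 1
    WRITE_SINGLE = 2
    WRITE_BURST = 3
    WRITE_MASKED = 4
    WRITE_MASKED_BURST = 5
    READ = 6
    READ_BURST = 7


MASKED_MODES = (BusMode.WRITE_MASKED, BusMode.WRITE_MASKED_BURST)


def pack_word(mode, payload, byte_enables=0xF):
    """Return (control, bus32): 3 control bits and the 32-bit bus value."""
    if not 0 <= byte_enables <= 0xF:
        raise ValueError("byte_enables must fit in 4 bits")
    if mode in MASKED_MODES:
        # Masked modes give up the top 4 bus bits to the byte enables.
        if payload < 0 or payload >> 28:
            raise ValueError("payload must fit in 28 bits in masked modes")
        return int(mode), (byte_enables << 28) | payload
    if payload < 0 or payload >> 32:
        raise ValueError("payload must fit in 32 bits")
    return int(mode), payload


def unpack_word(control, word):
    """Inverse of pack_word: return (mode, payload, byte_enables)."""
    mode = BusMode(control & 0x7)
    if mode in MASKED_MODES:
        return mode, word & 0x0FFFFFFF, word >> 28
    return mode, word, 0xF
```

For example, `pack_word(BusMode.WRITE_MASKED, 0x1234, 0b0101)` places the enables `0101` in the top nibble and leaves the lower 28 bits for the payload.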

For each SIM card the largest possible buffer was allocated in the FPGA. Since a smart card is accessed over a single-wire half-duplex interface, and we have 100 cards and only 105 memory blocks in the chip, it was decided to use one buffer for both reception and transmission, resetting the address to zero at the start of each transmission or reception. I was lucky: according to the standard, a SIM card cannot return more than 256 bytes of data per command, so the available buffer will not overflow before we read it.
My use of the term “command” differs slightly from the ISO-7816 standard. There, a command is a request to the card, then an acknowledgement from the card, and then the data. What matters to me is the amount of data between one buffer reset and the next.

It was decided not to create a buffer for data exchange with the second chip, but to write directly into the SIM card memory in the second chip, since I had already used 102 of the 105 memory blocks. The first or second chip is selected by accessing BAR0 or BAR1, respectively.
Different controllers were made for BAR0 and BAR1. Access to the second chip takes a clock cycle longer, but this is handled by delaying the TRDY signal on the PCI bus side.

For each chip 64K of memory is allocated, which is enough to serve 128 smart cards. We have 100, so 28 slots remain. Two of them hold service information: the reset state of the cards and the exchange rate. In the other direction we need to return the amount of data received from the SIM.
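The address arithmetic implied here is simple enough to sketch. The 64K window, the 128 slots, the 100 cards per chip and the BAR0 / BAR1 chip selection come from the text; the 512-byte slot size follows from dividing one by the other. The assumption that card N sits in slot N, and the function name `card_offset`, are illustrative only and not taken from the real register map.

```python
WINDOW_BYTES = 64 * 1024                # memory window per chip, from the text
SLOTS = 128                             # cards the window could serve
SLOT_BYTES = WINDOW_BYTES // SLOTS      # bytes available to one card
CARDS_PER_CHIP = 100
SPARE_SLOTS = SLOTS - CARDS_PER_CHIP    # two of these hold service information


def card_offset(chip, card):
    """Return (bar, offset) of a card buffer.

    Assumes card N occupies slot N, which the article does not state.
    chip 0 is reached through BAR0, chip 1 through BAR1.
    """
    if chip not in (0, 1):
        raise ValueError("chip must be 0 or 1")
    if not 0 <= card < CARDS_PER_CHIP:
        raise ValueError("card index out of range")
    return chip, card * SLOT_BYTES
```

With this layout a 256-byte response always fits into its 512-byte slot, which matches the remark that the buffer cannot overflow before it is read.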

Then all of this had to be brought to life and debugged. Since the project for 100 SIM cards takes a very long time to compile, and burning out 100 SIM cards is far more expensive than 1 or 10, it was decided to debug the exchange on 1 to 5 cards first and then add the rest, and to debug only the first half, moving on to the second chip once we decided that everything worked.
And then a surprise awaited me.
There is a schematic in which the card contacts are numbered from 1 to 100. There is a board project from which the FPGA pin assignments were imported successfully (PCI works, the test write to memory works, the exchange between the chips works too). The project is compiled for 5 cards, so that if I had mixed up numbering from 0 or from 1, the second card would still work. No such luck.
Never mind, we have 10 LEDs on the board. I had been going to deal with them later, but had to move that up. We add a simple block to turn an LED on and off, and the light stays dark: the LED does not blink.
We take the oscilloscope. The clock is there, but there is no reset. There is no data either.
Did the pin import let us down? We compare the schematic with the board, and the board and the schematic with Pin Planner in Quartus. They match. And there is still no reset.
And here a senior comrade comes to my aid. He says that the “princess is in another castle”: “SIM0” in the project is on the other side of the box. And it has always been that way.
Here is a screenshot from the PCB project in P-CAD. Signals are there

It is SIM 199 on the board and SIM0 in the project. The project is compiled for only 5 cards (it simply does not fit for 100 yet), so the other 95 cards are not served at all, and not even a memory buffer is created for them in the FPGA.

To save resources, it was decided to clock the memory blocks at the frequency the SIM cards work at: fewer dividers would be needed, and at a lower frequency it would be easier to meet the chip's timing. The plan was to write to a SIM card's buffer at PCI bus speed (33 MHz) and read it at the SIM card speed. Because of restrictions on dual-port memory in Cyclone II, I had to look for other solutions. In the end the simplest and most correct one was to clock all blocks at 33 MHz from PCI_CLK. The number of wide counters was reduced by using clock-enable signals on the registers. This approach also made it easier to write timing constraints for TimingAnalyzer using multicycle paths.
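The clock-enable idea can be illustrated with a small behavioral model. This is a Python sketch of the technique, not the author's HDL: every stage is stepped on each tick of the single fast clock, but a stage only counts when the stage before it produces an enable pulse. The ratios 6 and 372 are the ones the article quotes for the full-size counters; the class and function names are made up for this example.

```python
class EnableDivider:
    """Counter on the fast clock that emits a one-tick enable every `ratio` counts."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def tick(self, enable_in=True):
        # Without an incoming enable the stage holds its state.
        if not enable_in:
            return False
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


def count_pulses(cycles, ratios):
    """Run a chain of dividers for `cycles` fast-clock ticks and
    return how many enable pulses leave the last stage."""
    stages = [EnableDivider(r) for r in ratios]
    pulses = 0
    for _ in range(cycles):
        enable = True
        for stage in stages:
            enable = stage.tick(enable)
        if enable:
            pulses += 1
    return pulses
```

Chaining 6 and 372 gives one pulse per 2232 fast-clock ticks, roughly 67.6 µs at a nominal 33 MHz, while every register in the chain still belongs to the one PCI_CLK domain.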

Setting up the model. To carry out all these ideas successfully, we need a model. Altera provides the model for the PCI core. For smart cards I had to write a model myself.
The downside of this approach: if I misunderstand how a SIM card should work, I will write the model incorrectly too. But everything worked out. Not immediately, though.
At first the simulation let me avoid distracting my colleague, although later, while debugging on hardware, we had to work out thoroughly what was wrong in the model and how it should really be.

T0 / T1, or don't make it better, make it right


Initially the state machine for working with the SIM card was debugged on resetting the card and reading its ATR, changing the speed, and sending a simple “A0 A4 00 00 02” command, to which the only required answer was “A4”. I wanted a state machine that would read everything by itself at power-up, so that by the time the OS had loaded it would already be talking to each SIM card at its maximum speed. This would relieve the program of initializing two hundred (200!) SIM cards at startup.
With some SIM cards everything went great, while others stubbornly sent h'3B and fell silent. The timer in the state machine then expired and I reset the SIM card, since it had not told me at what speed to talk to it. If I received three or more bytes, I considered the ATR read and carried on. Everything was fine in the model, but it turned out that different SIM cards send the ATR differently. Some send all the information at once, while others pause after the h'3B header byte and only then output the rest. Increasing the pause solved this problem. For the time being: some SIM cards send the final bit after a pause as long as the one after the h'3B byte at the beginning.
Decided: no smart state machines, spontaneous speed switching and the like. Let the processor think about it: it decides when to assert the reset and when to release it, and it works out when a portion of data from the card is complete.
There was another problem too. Not all the extras had been implemented yet, and only 80-85 of the required 100 SIM cards fit into each chip. So let the operating system driver do this work: it is easy for the driver, and we get more room.
Then came debugging with other commands, now coming from the driver in the OS, debugging on a large number of SIM cards, and new “discoveries”.
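Once the host decides when a portion of data is complete, one option for the ATR is to derive its length from its own structure instead of from pauses. The sketch below is not the author's driver; it follows the ATR layout of ISO 7816-3, where the low nibble of T0 gives the number of historical bytes, the high nibble flags which of TA1..TD1 follow, each TD byte announces the next group, and a check byte TCK is present when any protocol other than T=0 is indicated. The function name `atr_length` is hypothetical.

```python
def atr_length(data):
    """Return the total ATR length implied by the bytes received so far,
    or None if more bytes are needed before the length is known."""
    if len(data) < 2:
        return None                  # need at least TS and T0
    historical = data[1] & 0x0F      # K: number of historical bytes
    flags = data[1] >> 4             # Y1: presence of TA1, TB1, TC1, TD1
    pos = 2                          # index of the next interface byte
    need_tck = False
    while True:
        for bit in (0x1, 0x2, 0x4):  # TA, TB, TC of this group
            if flags & bit:
                pos += 1
        if not flags & 0x8:          # no TD byte: interface bytes are done
            break
        if len(data) <= pos:
            return None              # TD announced but not received yet
        td = data[pos]
        pos += 1
        if td & 0x0F:                # a protocol other than T=0 requires TCK
            need_tck = True
        flags = td >> 4
    return pos + historical + (1 if need_tck else 0)


if __name__ == "__main__":
    # TS=3B, T0=02: no interface bytes, two historical bytes, no TCK.
    print(atr_length(bytes([0x3B, 0x02, 0x14, 0x50])))
```

A receiver built this way keeps reading until the returned length is reached, so a card that pauses after the h'3B byte or before its last byte is handled the same way as one that answers in a single burst. A timeout is still needed for cards that never answer.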

197 out of 200


How do you decide, if 197 out of 200 cards work and three do not, whether that counts as working or not? Testing was carried out on SIM cards blocked by the operator, which still responded to reset and pretended to write SMS messages to themselves. Perhaps the three cards were blocked for a good reason and are simply faulty? How naive I was.
The first run in real operation showed that everything was fine on the cards' side. So I went searching.
And I found it. Not right away. A colleague helped by setting up exchange logs for all SIM cards. It turned out that sometimes, when writing long commands, my counter skipped, and I wrote one byte twice in the middle of a long transfer. Since behavioral simulation takes a long time, I had not used long commands in the model. I limited myself to short transactions and shortened the counters. Take a step, simulate, check, it works. A little “Hooray!” and on we go. Write a byte, a word, a double word, walk through a couple of addresses, and assume the rest will be fine.
And so everything is fine on the small model, but in real life it does not work. Yet before the error occurs, the SIM card manages to report its number (if it is stored on the card), ICCID, IMSI and more. The failure shows up when imitating a write / read to the SIM card, and not always: only three cards out of two hundred fail.
Manual comparison of the program log with the logic analyzer capture showed that the error occurs in the body of the SMS message. 197 cards turn a blind eye to this error when checking the CRC (why strain over some SMS?), while three cards had been waiting for their finest hour.
I extended the model and ran it, and again everything was in order. How so?
I run the model with full-size counters (dividing 33 MHz by 6, then by 372, and so on, instead of the shortened division by 4 and then by 16). We start the model in the evening, hoping to see everything in the morning...
And in the morning we find our OS sleeping like a baby. It had turned off the screen to save power, but we did not get to sleep with it. After 4+ hours of simulating more than 300 signals and buses, it fell asleep “without taking off its shoes”. In the morning it booted only on the third attempt. What a joy that I did not have to restore the entire system.
A day of work, sleep mode disabled, and the next morning I was able to find the error. True, before that I had spent a whole day sorting out all sorts of other puzzles, and by the morning of that happy day I already guessed where to look.
It was not the last one; there were other errors, more obvious and easier to track down.
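The manual comparison of the program log with the logic analyzer capture could be partly automated. The sketch below is a hypothetical helper, not something used in the project: it assumes both sides are available as byte strings and that the fault looks like the one described, a single byte written twice with the rest of the transfer shifted by one position.

```python
def first_divergence(sent, captured):
    """Index of the first position where the two sequences differ, or None."""
    for i, (a, b) in enumerate(zip(sent, captured)):
        if a != b:
            return i
    if len(sent) != len(captured):
        return min(len(sent), len(captured))
    return None


def duplicated_byte_at(sent, captured):
    """Return the index of a repeated byte in `captured` if removing it
    makes the capture match what was sent, otherwise None."""
    i = first_divergence(sent, captured)
    if i is None or i == 0 or i >= len(captured):
        return None
    if captured[i] != captured[i - 1]:
        return None
    repaired = captured[:i] + captured[i + 1:]
    # The capture may be cut at the original length, so compare the overlap.
    if repaired == sent[:len(repaired)]:
        return i
    return None


if __name__ == "__main__":
    sent = bytes(range(16))
    captured = sent[:8] + sent[7:8] + sent[8:]   # byte 7 written twice
    print(duplicated_byte_at(sent, captured))
```

Run over the logs of all 200 cards, a check like this would point at the transfer length where the counter slips, which is the information the shortened simulation runs could not provide.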

Assembling the mosaic


Having assumed that work with the smart card was stable, I connected the second chip. First in the project, then in the model, then in hardware. Not all at once, but we got through this part. How do you check that all 200 slots work simultaneously and do not interfere with their neighbors? Where do you get that many SIM cards? We scraped together 200. Launched.
It started, and we handed the bank over for testing. And here was another surprise. If all slots in SimBank are occupied, everything is fine: the cards work stably and are not reset often. But if some cards are not installed, a periodic reset is issued to their slots, and sometimes the reset also hits the symmetrical cards of the neighboring FPGA. Why?
I still do not really know the answer to this question. But now I know what the SignalTap II Logic Analyzer in Altera's Quartus is. With it, the states of the PCI bus signals could be pulled out over JTAG. By modifying the project and creating new signals to start or stop the capture, I could look at some parts inside the FPGA. For example, I was surprised to learn that when reading or writing more than eight bytes, burst mode is almost never used: on writes never, and on reads for no more than two PCI cycles. Even though the program issues a “read 40 bytes” command, the reading goes four bytes at a time. From setting the address to PCI



The reset problem was solved partly in software by the driver and partly by a restriction not to use an odd number of SIM cards in the bank until better times.

Question to the audience: is there a way in Linux (CentOS 6, 32-bit) to log all accesses to the device, preferably without disturbing the operation of the PCI bus? Something where you set the Vendor ID / Device ID and it saves a log of R / W accesses on the bus?

Results


By the end of all the improvements, the full FPGA project uses 89% of the logic elements and 85% of the available internal memory. Everything fits, which means the component base and the project structure were chosen correctly.
The design also saved two 50 MHz oscillators: since the project uses only the 33 MHz PCI clock, they are unused and can be left off the board in the future. The project works with every motherboard “thrown at it” to date.
The device can register all the cards on the network at the same time, provided there are enough GSM channels.

Now, besides the firmware and driver updates for customers, several devices are working at our own company. Some send SMS for the SprintSMS project. Others are “looked after” by our technical support and work for a customer's benefit. This made it possible to understand the essence of the customer's work and what he really needs, both in ergonomics and in additional software functionality. In addition, a stably working device (knock on wood) made better software debugging possible.

But that's another story.

Thanks to all: to readers for their interest in the article, to the authors of other articles for useful lessons, to the creators of Habr for Habr, to colleagues for help in difficult times and for the experience they “provided”, and to my manager for understanding.

P.S. As a bonus, a picture from the web interface for visual monitoring of SimBank's operation.
Special thanks to Maxim for it.


Here SimBank is launched in 100-card mode.


List of cards with information read from them. On cards that require entering a PIN code, some of the information is not available.


The web interface is also available from a smartphone.
Cards currently in operation are highlighted in yellow.

Source: https://habr.com/ru/post/240943/

