I want to share with all readers an interesting topic, which I faced recently, and I liked it. The development of this topic gave me pleasure and added some experience to the piggy bank. Probably many, and maybe not, faced a pool of database connections. After reading this interesting option, I wanted to write an article and share it with you. Perhaps the article will turn out to be a bit long, but I think that this post will still be interesting for someone to read, and this topic will interest him. Maybe someone uses this article in this project, in any case, it will be interesting for me to write and tell it to you.I am engaged in freelancing. Somehow, one of my acquaintances brought me to the owners of a driving school, and there I was asked to make a program to form my collection of tickets. Something like your traffic book tickets. All questions are stored in MySQL DBMS in three tables, where in one collection of questions, in another subject of these questions and in the last answers to questions. As a result, 800 questions with or without a picture, including answers. The initial form of the formation was a conventional method with a consistent formation of questions inside. I am quite interested in the topic of multi-threaded program execution, so after creating a fully working method, I decided to make everything more convenient, but at the same time I added sample speeds.
To begin with, in order to form your version of the SDA tickets, you need to take into account the subject matter. In this case, only 12 topics. The first 8 topics consist of 40 questions, the remaining 120. Each topic corresponds to a specific number in the ticket, with the first 8 topics corresponding to 1 question from the ticket, the other 3 questions from the ticket. To store the generated version of questions, the Dictionary is used, where each key contains a list of questions on a specific topic. In this case, the order of questions should always be different, i.e. you need a method that will generate a sequence of numbers from 0 to 40 without repetitions, so it turns out to choose any ticket and, accordingly, a question from it. Taking into account everything, such an algorithm for the formation of all questions is obtained:
- we get all the topics that we have;
- from the topics we form a list of questions for each of the topics;
- generate random numbers;
- based on the list of questions and the sequence of random numbers, we make a sample and form a list of questions on the subject;
- we put each list in the dictionary.
The implementation and operation of the first two steps depends on the logic and in my opinion can be performed once when requesting a sample. The third step depends on how much we want to make the questions different, to make a new generation before each topic, or once for all topics. I chose to generate only once, in my opinion this is enough. The fourth step will work most often and in this place it should be most time consuming when sampling. The last step is simple and just skip it.
')
Connect to the database
Consider the connection to the database. To do this, use an abstract class that will hold the connection string, Connection, and DataReader:
public abstract class SqlBase { protected static readonly String Connect; protected readonly MySqlConnection SqlConnection; protected MySqlDataReader SqlDataReader; static SqlBase() { Connect = String.Format("Database={0};Data Source={1};User ID={2};Password={3};CharSet=utf8", Settings.Default.Database, Settings.Default.DataSource, Settings.Default.UserId, Settings.Default.Password); } protected SqlBase() { try { this.SqlConnection = new MySqlConnection(Connect); this.SqlConnection.Open(); } catch (Exception ex) { throw new Exception(ex.Message, ex); } } }
I like abstract classes because they allow you to create and hide comfortable work logic in yourself, allowing you to concentrate fully on other concrete things in a derivative and not worry about other details, placing the responsibilities on the base class. In this case, creating an instance of a derived class, we automatically obtain the necessary conditions for working with the database and we need to implement the logic of the class itself to formulate a sample of questions:
public sealed class SqlPerceptionQuestions : SqlBase { public Dictionary<Int32, List<Question>> GetQuestion() { Generation(); GetTheme(); return Request(); } private Dictionary<Int32, List<Question>> Request() { var _collectionDictionary = new Dictionary<Int32, List<Question>>(); for(int ctr = 0; ctr < 12; ctr++) { using (var _questions = new SqlQuestions()) { if (ctr < 8) { _collectionDictionary[ctr] = _questions.GetQuestionSmall((Int16)ctr); } else { _collectionDictionary[ctr] = _questions.GetQuestionGreat((Int16)ctr); } } return _collectionDictionary; } private async void GetTheme() { // } // 40 private List<Question> GetQuestionSmall(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numCard, numTheme)); } return _listQuestions; } // 120 private List<Question> GetQuestionGreat(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numQuestion = 0; numQuestion < 3; numQuestion++) for (int numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numQuestion, numTheme, numQuestion)); } return _listQuestions; } // private Question GetQuestion(Int16 numCard, Int16 numTheme, Int16 numQuestion = 0) { // } // private List<String> GetResponse(Int32 questions_id) { // } }
This is the most simple and synchronous version, which runs for about 2 seconds and 200-400 milliseconds, which, accordingly, will block the user interface for all this time. This is already a good working version, since the very first implementation worked for quite a long time, about 6 seconds. After the improvement, it took about 2 seconds.
Creating an asynchronous version of the sample
Everything is good and everything is already working, but is it how it should be? After all, we have a synchronous method (blocking) and not a console application. A proper and completely working program is needed, which will not block even for half a second, but will work properly under any load. To do this, first rewrite the GetQuestion () method. Let's make it asynchronous in accordance with the TAP (Task-based Asynchronous Pattern) pattern. Who is interested in reading on the Internet or there is a pretty good book that I like very much - “Asynchronous programming in C # 5.0” by Alex Davis, where this subject is very well described, or
look here . Rewrite it and it will look like this:
public async Task<Dictionary<Int32, List<Question>>> GetQuestionAsync() { return await Task.Factory.StartNew(() => { Generation(); GetTheme(); return Request(); }, TaskCreationOptions.LongRunning); }
Consider the most interesting in this method: Task.Factory.StartNew (). Starting with the .NET 4.5 version, you can use the Task.Run () version, which differs from the previous one in a simpler declaration with a smaller number of parameters when created. In essence Task.Run (), it is simply a more convenient shell over Task.Factory.StartNew () and is very suitable for simply creating asynchronous tasks, but at the same time it has a little less control flexibility. If you need more precise control over which thread performs the calculations or how it is planned, use Task.Factory.StartNew (). If interested, take a
look here . In this case, I used this option for the reason that I also specified a parameter such as TaskCreationOptions.LongRunning, which marks this task as long and means that this work item will run for a long period of time and may block other work items . This flag also provides information for TaskScheduler, what to expect oversubscription, and this will create more threads than the number of available hardware threads. With this option, you can completely avoid ThreadPool, including global and local queues. Read more
"Task Scheduler" .
Thus, we get the asynchronous execution of the formation of a collection of data and do not block the main stream.
Sampling parallelization
After that, we proceed to parallelization, which will give us an increase in the speed of data sampling. But for this, you will have to redo a little main class, change the base class and add another one to ensure parallel cycle operation in order to avoid data races, critical sections and problems with variables for different streams.
To begin, create a new nested class in the
SqlPerceptionQuestions class -
SqlQuestions . We will transfer the methods to get questions and answers to them, and also make it a derivative of SqlBase, while leaving the outer class to get the topics, we will call it only once, as well as the formation of a sequence of numbers.
Class code
SqlQuestions :
internal sealed class SqlQuestions : SqlBase { internal List<Question> GetQuestionSmall(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numCard, numTheme)); } return _listQuestions; } internal List<Question> GetQuestionGreat(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numQuestion = 0; numQuestion < 3; numQuestion++) for (int numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numQuestion, numTheme, numQuestion)); } return _listQuestions; } private Question GetQuestion(Int16 numCard, Int16 numTheme, Int16 numQuestion = 0) {
For parallelization of the loop we will use Parallel.For (). This is a rather convenient way to organize data loading in several streams. But it is also fraught with the fact that we will need to create at least several connections to the database, since one Connection is able to process one DataReader. Rewrite the .Request () method:
private Dictionary<Int32, List<Question>> Request() { var _collectionDictionary = new Dictionary<Int32, List<Question>>(); var _po = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount }; Parallel.For(0, 12, _po, ctr => { using (var _questions = new SqlQuestions()) { if (ctr < 8) { _collectionDictionary[ctr] = _questions.GetQuestionSmall((Int16)ctr); } else { _collectionDictionary[ctr] = _questions.GetQuestionGreat((Int16)ctr); } } }); return _collectionDictionary; }
And after opening the connection, it must be closed. In this case, all this will happen in a cycle. To implement all this, I decided to create a separate derived class
SqlQuestions . To close the connection, we will call .Dispose (), in which we will write what we need to do when closing. To do this, first declare the .Dispose () method in the base class:
public void Dispose() { Dispose(true); GC.SuppressFinalize(this); } protected abstract void Dispose(Boolean disposing);
And we will implement it differently in derived classes. Why is that? Let's start with what happens when you create a connection to the database and close it? If you open a connection and then close it, it is placed for a while (about 3 minutes) into the MySQL connection pool, and if you open a new one, the connection is taken from the pool, which means that time and resources for reopening are not used. Let's launch our new methods and see how much time it takes to open database connections, for this we insert in the base class Stopwatch into the code where the connection is opened and see what we have at the output. Code:
protected SqlBase() { Stopwatch st = new Stopwatch(); st.Start(); try { this.SqlConnection = new MySqlConnection(Connect); this.SqlConnection.Open(); } catch (MySqlException ex) { throw new Exception(ex.Message, ex); } st.Stop(); Debug.WriteLine(" : " + st.Elapsed.Seconds.ToString() + " " + st.Elapsed.Milliseconds.ToString() + " "); }
The first connection is the longest; it opens a connection for the
SqlPerceptionQuestions class, which will be open for the entire duration of the method. Subsequent connections are those that were opened while in a loop when creating instances of the
SqlQuestions class. Taking into account the number of processors on my computer, of which 4 we find that in a cycle the maximum will be open 4 connections. We get that 5 connections will initially be open, while in the cycle they will both open and close. Therefore, the first 5 connections take time to open, and after when the old connections are closed in the cycle and new ones are opened, time and resources will not go to them, as the connections are in the pool and they are simply issued each time they are required. Because of this, the cleaning of classes is a little differently implemented. In the
SqlPerceptionQuestions class
, the method will look like this:
protected override void Dispose(bool disposing) { if (!this.disposed) { if (SqlDataReader != null) { SqlDataReader.Close(); SqlDataReader.Dispose(); } SqlConnection.Close(); SqlConnection.Dispose(); MySqlConnection.ClearAllPools(); } disposed = true; }
And in the
SqlQuestions class
, the exact same, except for the string MySqlConnection.ClearAllPools (); After all, if we leave it, we get the following situation:
As we see, the constant cleaning of the thread pool leads to the constant opening of the connection with the ensuing consequences.
MySql connection pool
Consider a little more about this moment Connector / Net MySql supports connection pooling to improve the performance and scalability of application-intensive databases. This feature is enabled by default. You can turn it off or change the characteristics using the connection string options. The connection string supports options relative to the connection pool:
Option Name | Value by default | Details |
---|
Cache Server Properties, CacheServerProperties | Disabled | Specifies whether to update the parameters of some system variables (SHOW VARIABLES) every time a connection from the pool is returned. Enabling this option speeds up connection to the connection pool context. Your application is not informed of configuration changes made by other connections. This option has been added since Connector / Net 6.3. |
Connection Lifetime, ConnectionLifeTime | 0 | When a connection is returned to the pool, its creation time is compared with the current time and, if it exceeds the value of ConnectionLifeTime, the connection is destroyed. This is useful in cluster configurations for load balancing between a production server and a server that is online. A value of zero (0) makes the connections in the pool stand by for the longest possible time. |
Connection Reset, ConnectionReset | false | If true, the connection status is removed from the pool. The default value avoids the additional processing cycle by the server to receive the connection, but the connection status is not reset. |
Maximum Pool Size, Max Pool Size, MaximumPoolsize, maxpoolsize | 100 | The maximum number of connections allowed to be in the pool. |
Minimum Pool Size, Min Pool Size, Minimum Size Size, minpoolsize | 0 | The minimum number of connections allowed to be in the pool. |
Pooling | true | If true, then the MySqlConnection object is taken from the pool, if necessary, it is created and added to the corresponding pool. Specific values: true, false, yes and no. |
Using these parameters, you can manage the pool as needed. For example:
- Connection Reset resets the connection context and if it is used by default, then the time to receive a connection from the pool is slightly increased.
- Minimum Pool Size allows you to set, if necessary, the number of connections in the pool that will exist indefinitely, if the number of connections is less than or equal to the Minimum Pool Size value.
- Disabling Pooling will produce the same results if you clear the thread pool all the time (as in the example of this program)
By default, starting with MySQL Connector / Net 6.2, there is a background job that runs every three minutes and removes connections from the pool that have been in standby mode (not used) for more than three minutes. Cleaning the pool frees up resources both on the client and on the server side. This is because on the client side each connection uses a socket, and on the server side each connection uses a socket and a stream.
Parallel call stack
For the sake of interest, if you put a breakpoint, for example, in the .GetQuestion () method and see the parallel call stack, you will see:
As can be seen from the screenshot, we are in one of the threads that is suspended, and by the call stack we determine that this method was called from the method that loads a small collection of questions (40). There are 3 more streams to the left of it, and two of them are stopped at this moment on the line of adding a question to the collection, which also process a small collection of questions. And the last of them, 4th thread, deals with processing and receives an answer at this moment to a question, only for a question from a large collection (120). All these 4 threads were created in a parallel loop and work almost simultaneously with the rest of the iterations of the loop. These threads are included in the total number of threads in the program, of which 8, where the remaining 4 solve other tasks of the program.
The final touch is exception handling.
And finally, for the program to work, we need exception handling. Suddenly the fields or some other parameters changed from the database or an unexpected error occurred in the program itself. Rewrite the GetQuestionAsync () method:
public async Task<Dictionary<Int32, List<Question>>> GetQuestionAsync() { return await Task.Factory.StartNew(() => { try { Generation(); GetTheme(); return Request(); } catch (AggregateException ex) { throw new AggregateException(ex.Message, ex.InnerExceptions); } catch (Exception ex) { throw new AggregateException(ex.Message, ex.InnerException); } }, TaskCreationOptions.LongRunning); }
Exception handling AggregateException is due to the fact that the Parallel.For loop, if an exception occurs, will raise this type of error and, therefore, it must be processed and passed to the caller. It is quite logical that a parallel loop generates an error of this kind. Consider this moment in more detail: for this, I changed the Sql query in .GetQuestion (), knowingly incorrectly specifying one of the parameters that does not exist in the database table. We get:
Moreover, if we continue debugging, then the whole exception will occur 4 times, which is quite logical. In order to handle all 4, even if they are related to the same reason, it is necessary to arrange them somehow, for which the AggregateException exception is suitable.
Exception handling is related to the fact that if an exception occurs in the .GetTheme () method, then one exception will be thrown there and it should also be caught.
The calling code is arranged as follows:
private async void Button_Click(object sender, RoutedEventArgs e) { Stopwatch st = new Stopwatch(); st.Start(); try { SqlQuest = new SqlPerceptionQuestions(); collectionQuest = await SqlQuest.GetQuestionAsync(); } catch (AggregateException ex) { ex.ShowError(); } catch(Exception ex) { ex.ShowError(); } finally { if (SqlQuest != null) SqlQuest.Dispose(); } st.Stop(); Debug.WriteLine(" : " + st.Elapsed.Seconds.ToString() + " " + st.Elapsed.Milliseconds.ToString() + " "); Debugger.Break(); }
Finally ...
In general, it started with the fact that when I wrote a parallel version of the cycle, I thought, how often the frequent opening and closing of the database connection affects. Visiting forums and asking smart people I learned about the pool of connections. Then I asked myself a question: is it necessary to create it or is it being created in an implicit form? I experimented a bit and read the MySQL documentation, and as a result I came to the conclusion that this is a very good thing, a pool of connections!
All code after changes // public abstract class SqlBase : IDisposable { protected static readonly String Connect; protected readonly MySqlConnection SqlConnection; protected MySqlDataReader SqlDataReader; protected Boolean disposed; static SqlBase() { Connect = String.Format("Database={0};Data Source={1};User ID={2};Password={3};CharSet=utf8;CacheServerProperties=true", Settings.Default.Database, Settings.Default.DataSource, Settings.Default.UserId, Settings.Default.Password); } protected SqlBase() { Stopwatch st = new Stopwatch(); st.Start(); try { this.SqlConnection = new MySqlConnection(Connect); this.SqlConnection.Open(); } catch (Exception ex) { throw new Exception(ex.Message, ex); } st.Stop(); Debug.WriteLine(" : " + st.Elapsed.Seconds.ToString() + " " + st.Elapsed.Milliseconds.ToString() + " "); } ~SqlBase() { Dispose(false); } public void Dispose() { Dispose(true); GC.SuppressFinalize(this); } protected abstract void Dispose(Boolean disposing); } // public sealed class SqlPerceptionQuestions : SqlBase { public async Task<Dictionary<Int32, List<Question>>> GetQuestionAsync() { return await Task.Factory.StartNew(() => { try { Generation(); GetTheme(); return Request(); } catch (AggregateException ex) { throw new AggregateException(ex.Message, ex.InnerExceptions); } catch (Exception ex) { throw new AggregateException(ex.Message, ex.InnerException); } }, TaskCreationOptions.LongRunning); } private Dictionary<Int32, List<Question>> Request() { var _collectionDictionary = new Dictionary<Int32, List<Question>>(); var _po = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount }; Parallel.For(0, 12, _po, ctr => { using (var _questions = new SqlQuestions()) { if (ctr < 8) { _collectionDictionary[ctr] = _questions.GetQuestionSmall((Int16)ctr); } else { _collectionDictionary[ctr] = _questions.GetQuestionGreat((Int16)ctr); } } }); return _collectionDictionary; } private void GetTheme() { } private void Generation() { } protected override void Dispose(bool disposing) { if (!this.disposed) { if (SqlDataReader != null) { SqlDataReader.Close(); SqlDataReader.Dispose(); } SqlConnection.Close(); SqlConnection.Dispose(); MySqlConnection.ClearAllPools(); } disposed = true; } // internal sealed class SqlQuestions : SqlBase { internal List<Question> GetQuestionSmall(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numCard, numTheme)); } return _listQuestions; } internal List<Question> GetQuestionGreat(Int16 numTheme) { var _listQuestions = new List<Question>(); for (Int16 numQuestion = 0; numQuestion < 3; numQuestion++) for (int numCard = 0; numCard < 40; numCard++) { _listQuestions.Add(GetQuestion(numQuestion, numTheme, numQuestion)); } return _listQuestions; } private Question GetQuestion(Int16 numCard, Int16 numTheme, Int16 numQuestion = 0) { } private List<String> GetResponse(Int32 questions_id) { } protected override void Dispose(bool disposing) { if (!this.disposed) { if (SqlDataReader != null) { SqlDataReader.Close(); SqlDataReader.Dispose(); } SqlConnection.Close(); SqlConnection.Dispose(); } disposed = true; } } }