
Writing a simple file parser (for beginners)

In this article I would like to show you how to write a simple parser, using the aimp.ru and geekbrains.ru sites as examples. The article is intended strictly for those who already have basic knowledge of the C# programming language and have written their first "Hello world".

I have always liked the Aimp audio player (no, this is not an advertisement), but it ships with too few built-in skins, and I had no desire to go to the site, browse the skins, download them, and try each one to see how it looks. So I decided to write a skin parser for the site. After looking at the site a bit, I noticed that the skins are stored sequentially, each with its own id. Since until recently I only knew 1C and a bit of the command line, I decided without thinking twice to write it as a command-line script. But during testing I found that when downloading a large number of files, first, some of them may simply fail to download, and second, the machine may run out of RAM. In the end I abandoned the idea.

Having recently started studying C#, I decided to return to this idea for a bit of practice. Read on to see what came of it.

For development we need only a development environment. I used Visual Studio, but you can use any other you like.
I will not go into the basic concepts of C#; plenty of books have been written and countless videos recorded on that.

First, let's start Visual Studio and create a console application (I am too lazy to build forms, and we do not need an interface anyway). The development environment will prepare a project template for us. We will get something like this:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;

    namespace ConsoleApplication2
    {
        class Program
        {
            static void Main(string[] args)
            {
            }
        }
    }

Remove the directives that we will not use:

    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;

And add the directives that we will need:

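    using System.Diagnostics; // needed to start external processes (Process.Start)
    using System.Net;         // needed to work with the Web (WebClient)
    using System.Threading;   // needed to pause the current thread (Thread.Sleep)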

Then in the Main method we declare a variable:

    WebClient wc = new WebClient(); // the object we will use to work with the Web

Parsing Aimp skins


Next, we write the function itself:

    static string DownloadSkinsForAimp(WebClient wc)
    {
        Console.WriteLine("Downloading began");
        try
        {
            for (int i = 0; i <= 5; i++)
            {
                string id = "79" + i;
                string name = GetNameOfSkin(wc, id); // get the skin name by its id
                string path = "aimp.ru/index.php?do=download&sub=catalog&id=" + id;
                try
                {
                    // open the download link in the browser, which saves the file
                    Process.Start("chrome.exe", path);
                    Console.WriteLine("Download " + name + " is successful!");
                    // wait 5 seconds before starting the next download
                    Thread.Sleep(5000);
                }
                catch
                {
                    Console.WriteLine("Download " + name + " failed");
                }
            }
        }
        catch
        {
            return "\nSomething went wrong";
        }
        return "\nDownloading complete";
    }

All skins are downloaded to the directory specified in the browser settings. We need the try/catch constructs so that the program does not crash on errors, although we could have done without them.
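To make the role of the inner try/catch more visible, here is a minimal sketch of the same pattern in isolation: one failed item is reported and the loop moves on to the next one. The helper RunForEach and the stand-in action are hypothetical and not part of the original project, and nothing here touches the network. The function can be pasted into class Program, or the whole snippet can be run as a top-level program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the article): runs an action for every id.
// A failure is reported and remembered, and the loop continues with the next id.
static List<string> RunForEach(IEnumerable<string> ids, Action<string> action)
{
    var failed = new List<string>();
    foreach (string id in ids)
    {
        try
        {
            action(id);
        }
        catch (Exception ex)
        {
            Console.WriteLine("Download " + id + " failed: " + ex.Message);
            failed.Add(id);
        }
    }
    return failed;
}

// Usage with a stand-in action that fails for one id.
List<string> failedIds = RunForEach(new[] { "790", "791", "792" }, id =>
{
    if (id == "791") throw new InvalidOperationException("simulated failure");
    Console.WriteLine("Download " + id + " started");
});
Console.WriteLine("Failed: " + string.Join(", ", failedIds));
```

Unlike the bare catch in the article, this variant catches Exception explicitly so the error message can be printed, which tends to help when figuring out why a particular id failed.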

You may have noticed the GetNameOfSkin function. It gets the name of the skin we are downloading. We could do without it, since it is only there for nicer output, but since we are learning, we will write it too:

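    static string GetNameOfSkin(WebClient wc, string id)
    {
        // download the HTML of the skin's page
        string html = wc.DownloadString("http://www.aimp.ru/index.php?do=catalog&rec_id=" + id);
        // find the first occurrence of the id and keep everything
        // that starts 5 characters after that position
        string rightPartOfHtml = html.Substring(html.IndexOf(id) + 5);
        // the name runs up to the next tag; replace spaces with underscores
        string name = rightPartOfHtml.Substring(0, rightPartOfHtml.IndexOf("<")).Replace(" ", "_");
        // return the resulting name
        return name;
    }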
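The substring logic is easier to follow on a fixed string than on a live page. Below is a sketch of the same extraction applied to an HTML string passed in as a parameter, with guards for the cases where the id or the following tag is missing. ExtractName is a hypothetical name, and the sample fragment is made up for illustration; it is not the real markup of aimp.ru. The offset of 5 is taken from the article and depends on the page layout.

```csharp
using System;

// Hypothetical variant of GetNameOfSkin that works on an already downloaded string.
// Returns an empty string when the id or the closing "<" cannot be found.
static string ExtractName(string html, string id)
{
    int idPos = html.IndexOf(id, StringComparison.Ordinal);
    if (idPos < 0 || idPos + 5 > html.Length)
    {
        return string.Empty;
    }
    // skip 5 characters starting at the id (offset taken from the article)
    string rest = html.Substring(idPos + 5);
    int end = rest.IndexOf('<');
    if (end < 0)
    {
        return string.Empty;
    }
    return rest.Substring(0, end).Replace(" ", "_");
}

// Made-up fragment: the id is followed by two characters and then the name.
string sample = "<a href=\"?rec_id=790\">Dark Blue</a>";
Console.WriteLine(ExtractName(sample, "790")); // prints "Dark_Blue"
```

Without the guards, a page that does not contain the id would make IndexOf return -1 and the function would silently cut the text from the wrong place.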

Then call the download from the Main method:

    Console.WriteLine(DownloadSkinsForAimp(wc)); // run the download and
                                                 // print the status message it returns

Parsing Geekbrains certificates


Certificates on the site are stored openly as plain files, but if we open them through the site the way we did with the Aimp skins, we can only download them manually by clicking the download button. That is not our way; we are programmers.

This is where the WebClient class comes to our rescue, namely its DownloadFile method. We just give it the address to download from and the path to save to, and it does everything for us. It sounds easy, so let's try it:

    static string DownloadCertificates(WebClient wc)
    {
        // get the current user's name to build the path to the Downloads folder
        string currentUser = Environment.UserName;
        Console.WriteLine("Downloading began");
        try
        {
            for (int i = 0; i <= 5; i++)
            {
                try
                {
                    // download the certificate and save it as a '.pdf' file
                    wc.DownloadFile("https://geekbrains.ru/certificates/7075" + i + ".pdf",
                                    "c:\\users\\" + currentUser + "\\downloads\\7075" + i + ".pdf");
                    Console.WriteLine("Download certificate №7075" + i + " is successful");
                }
                catch
                {
                    Console.WriteLine("Download certificate №7075" + i + " failed");
                }
            }
        }
        catch
        {
            return "\nSomething went wrong";
        }
        return "\nDownloading certificates is complete!";
    }

And after that, just call this function from the Main method.
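The save path in DownloadCertificates is assembled by hand from "c:\users", the user name and "downloads", which breaks if the profile lives on another drive. One possible alternative, sketched below, asks the system for the profile folder and joins the parts with Path.Combine. CertificatePath is a hypothetical helper, and the sketch assumes, like the article, that the Downloads folder sits directly inside the user profile.

```csharp
using System;
using System.IO;

// Hypothetical helper: local path for a certificate file in the user's Downloads folder.
// Assumption: Downloads is located directly under the user profile directory.
static string CertificatePath(string number)
{
    string profile = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
    return Path.Combine(profile, "Downloads", number + ".pdf");
}

// The result could be passed as the second argument of wc.DownloadFile.
Console.WriteLine(CertificatePath("70750"));
```

Path.Combine inserts the correct separator itself, so the doubled backslashes from the hand-built string are no longer needed.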

Both of these functions could certainly be improved, but I think they are good enough for getting acquainted with the most basic parsing techniques. If you are too lazy to assemble all this into one project, here is the link to GitHub.

Thank you for your attention, and I hope this helps someone.

PS: You can download certificates from Geekbrains, change the owner's first and last name to your own, and admire them.

PPS: All skins downloaded from the Aimp website are stored in the '.zip' format. If desired, the function can be extended to unpack them, and also to move them straight into the Aimp skins folder.
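As a rough idea of how the unpacking mentioned in the PPS might look, here is a sketch built on ZipFile.ExtractToDirectory from System.IO.Compression (on the .NET Framework this needs a reference to the System.IO.Compression.FileSystem assembly). UnpackSkins and both folder parameters are hypothetical, and the sketch makes no claim about the folder layout Aimp itself expects for skins.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: unpacks every .zip in sourceDir into its own subfolder of targetDir.
// Returns the number of archives that were unpacked.
static int UnpackSkins(string sourceDir, string targetDir)
{
    int count = 0;
    foreach (string zipPath in Directory.GetFiles(sourceDir, "*.zip"))
    {
        string skinDir = Path.Combine(targetDir, Path.GetFileNameWithoutExtension(zipPath));
        if (Directory.Exists(skinDir))
        {
            continue; // already unpacked earlier, skip to avoid "file exists" errors
        }
        ZipFile.ExtractToDirectory(zipPath, skinDir);
        count++;
    }
    return count;
}
```

It could be called after the downloads finish, for example as UnpackSkins(downloadsFolder, skinsFolder), where both variables hold paths chosen by the reader.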

PPPS: This article is for informational purposes only and is not an advertisement.

Source: https://habr.com/ru/post/282119/

