1. Introduction
This text is a short summary of my experience applying to Computer Science PhD programs with a focus on machine learning in North America. In this guide I tried to collect my own mistakes (it is better to learn from the mistakes of others), along with more or less universal things that are useful to everyone. Keep in mind that this is a fairly individual experience, so your own strategy may differ, for example in choosing universities and advisors or in writing the statement of purpose. You may also start from different conditions than I did (grades, papers, recommendations).
Keep in mind that the main part of the guide was written before I got the results, because I wanted to avoid survivorship bias and analyze my experience regardless of whether I got in or not. My results are at the end of the guide: I was admitted to 2 of the 11 universities I applied to. In my opinion, you should still avoid the mistakes I describe here. Also understand that the ML PhD application process is very noisy, so you can do everything right and still be rejected, and probably vice versa.

Expect applying for a PhD to take two to six months, depending on your starting level and how you organize the work. I did it in about two months, and it was stressful. If you have no research papers, it may make sense to spend a year or two writing them. As for money: $400 (GRE + TOEFL) + $70-150 for each submitted application + $150 (GRE / TOEFL preparation through Magoosh). These numbers were current as of the end of 2017.
Briefly, the PhD application process goes like this: you prepare for and take the GRE / TOEFL, choose universities and potential advisors, write a statement of purpose / personal history, write to potential advisors, fill out applications, wait for answers, do interviews (in some cases), get in, do awesome research, and get hired as a professor at Stanford or a researcher at Google (though that last part is not guaranteed). Each chapter of this guide describes one part of this process. At the end of each chapter I also collected useful links that I came across while preparing, because my experience is neither the first nor the last.
2. Why do I need a PhD?
This is the main question you need to answer for yourself before getting involved in all this. Applying costs time, money and, most importantly, nerves. Yes, in the process you will learn something about yourself and get a somewhat better idea of how research in this area is organized, but you can gain this knowledge under less stressful conditions and do much more useful things with that time.

In my opinion, there is only one sound answer to the question “Why do I need a PhD?”: you want to do research in this area. If you see a PhD as a way to get into Google / Facebook / Amazon, there are many other, more reliable and more interesting ways. A PhD takes 4 to 6 years, and in that time it is quite possible to build a solid career as a data scientist or data engineer. Moreover, if your PhD goes wrong, you will be in a much weaker position than the people who were working while you were suffering through it.
In essence, a PhD is a license to do research (though not the only way to do it). If you do not know what you would do with this license afterwards, it is better not to get involved.
3. Country selection
Initially this section was not in the guide, but I decided to add it because of the visa situation. The harsh truth is that in the current geopolitical situation (2018) it has become harder for many foreigners to obtain US study visas, especially if they work on dual-use technologies: atomic physics, computer science, chemistry, and so on. It is almost certain that when you apply for a visa you will be sent into so-called administrative processing, which used to take about three weeks and can now take three months or more.
The second problem with the American study visa is that it will likely be issued for just one year. This means that you either get stuck in the United States (you can stay there with an expired visa as long as your other documents are in order), or you have to renew your visa every year if you want to go to conferences outside the United States (a visa gives you the right to enter the country, not the right to stay in it). If geographic mobility is important to you or you want to visit relatives regularly, you should seriously consider applying outside the USA, for example in Canada or Europe.
It is also important to understand how countries differ in their approach to the PhD. In Europe, a PhD usually requires a master's degree and lasts 3-4 years, during which you work on a specific project. In Canada and the United States, people usually enter graduate school right after undergraduate studies, spend the first two years on coursework, choose a supervisor and a topic, and defend 5-6 years after starting. You can also enter a US PhD program with a master's degree, but it is not the first thing most universities look at.
Useful links: comment on the situation with PhD applications from foreigners in the US
4. Cost estimate
This mainly concerns American / Canadian universities, almost all of which require an application fee ($70-125 per university) and official score reports ($27 GRE + $19 TOEFL). As a result, one application costs $100-150. There are also the fixed costs of taking the GRE and TOEFL, about $200 each. In other words, if you want to apply to 10 American universities, it will cost you about $2000. The calculation is current as of the end of 2017.
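The arithmetic above is easy to redo for your own number of universities. Below is a minimal sketch in Python; the function name and parameters are my own illustrative choices, and the default figures are the end-of-2017 numbers quoted in this guide ($100-150 per application, about $400 for both tests, $150 for a prep subscription), so treat the result as a rough estimate only.

```python
# Rough application budget using the end-of-2017 figures quoted in this guide.
# The function and parameter names are illustrative, not from any real tool.

def application_budget(n_universities, per_application=(100, 150),
                       test_fees=400, prep_course=150):
    """Return a (low, high) estimate in US dollars."""
    fixed = test_fees + prep_course  # GRE + TOEFL fees plus a prep subscription
    low = fixed + n_universities * per_application[0]
    high = fixed + n_universities * per_application[1]
    return low, high


if __name__ == "__main__":
    low, high = application_budget(10)
    print(f"10 applications: ${low}-${high}")
```

For 10 universities this gives a range of $1550 to $2050, which is in line with the rough figure of about $2000.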
The second important component of the cost is time. It took me about two months: one to prepare for the GRE and the other to search for advisors, write the statement of purpose, and fill out applications. In my opinion, this is the absolute minimum. This was not full-time work, because I was working in a research laboratory in parallel, so if you have more free time you may manage faster. If you prefer minimal stress, it is better to start 3-6 months before the application deadline.
5. Preparation for GRE
5.1 General
The GRE is a 3-hour, 45-minute test of your quantitative reasoning (quant, Q), your ability to analyze texts and sentences combined with vocabulary (verbal, V), and your ability to write analytical essays (AWA). The test itself is described in detail in plenty of places, so here I will share my impressions and tricks.

The whole GRE story is rather silly, in my opinion. If you do very well, it gives no particular advantage, because most strong candidates do well. But if you do badly, it can hurt a lot. This makes preparation dreary and tedious, since such a setup is not motivating at all (you have to struggle just to stay in place). I used a few mental tricks to make this tedious process more enjoyable and effective.
Set yourself a goal. My goal was 165Q, 155V. I did not set a goal for AWA, and that was a mistake. I ended up with 169Q, 159V and 3.0 AWA, where the first two scores are very good for my field (96th and 83rd percentiles) and the last one is very mediocre (18th percentile). Had I set a specific AWA goal, my preparation would have been more effective.
Treat the GRE as an opportunity to learn something. In math, I refreshed some school knowledge and learned a few estimation tricks. In verbal, I significantly expanded my vocabulary and learned words I would never have learned otherwise. Without this trick, preparing for the GRE is terribly boring.
Understand the test meta. GRE questions are not always worded as clearly as possible, and this is on purpose. The test writers know well the conditions under which you take the test and, within the rules, sometimes try to confuse you. You need to understand how these traps work in order not to fall into them. Magoosh is very useful for this (see below).
Use www.magoosh.com. A six-month subscription costs $150 and is worth it. Magoosh has lots of short, clear videos that explain how the GRE works and the main tricks and traps of the test writers, and help you refresh the math you have forgotten. Plus, there are about a thousand quant and verbal problems, along with convenient, clear statistics and a way to track the sections where you make the most mistakes.
Estimate the time you need to prepare. The rule of thumb, which is written everywhere and which I agree with, is that it takes on average 40 hours to improve a score in one section (for example, quant) by 5 points. For example, if your first test came out at 160Q / 155V, you need about 80 hours to raise it to 165Q / 160V. It is important to account for your individual characteristics, though. For example, if you are sure your scores are low because of nerves, you may need less or more time to develop your test-taking strategy.
Set your training routine according to your priorities and available time. I had exactly one month to prepare, so my routine was 40 quant questions and 40 verbal questions daily. I had no AWA routine, and that was a mistake.
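The 40-hours-per-5-points rule of thumb from the list above can be turned into a quick planning calculation. This is only a sketch: the function names are hypothetical, the rule is an average rather than a guarantee, and it assumes the hours scale linearly with the score gap, which the guide does not claim beyond the example given.

```python
# Planning sketch for the "40 hours per 5 points per section" rule of thumb.
# Assumption (mine): hours scale linearly with the score gap.

HOURS_PER_5_POINTS = 40


def prep_hours(current, target):
    """Estimate total preparation hours from per-section current and target scores."""
    total = 0.0
    for section, goal in target.items():
        gap = max(0, goal - current[section])  # no extra hours if already at the goal
        total += gap / 5 * HOURS_PER_5_POINTS
    return total


def hours_per_day(total_hours, days_available):
    """Spread the estimated hours evenly over the days you have."""
    return total_hours / days_available


if __name__ == "__main__":
    hours = prep_hours({"quant": 160, "verbal": 155},
                       {"quant": 165, "verbal": 160})
    print(hours, round(hours_per_day(hours, 30), 1))
```

With the example from the text (160Q / 155V to 165Q / 160V) this gives 80 hours, or a bit under three hours a day if you have one month.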
5.2 Quant
It is important to understand that GRE Quant tests not only knowledge of basic mathematics but also attention and concentration. At the start of my preparation I rated myself on these three points (excellent / okay / bad) and built my preparation accordingly. In my case, mathematics was excellent, attention was bad, and concentration was excellent. By concentration I mean the ability to work under tight time pressure.

Every day I solved at least 40 Magoosh questions in quiz mode, where you answer all the questions and only then see the answers. I would never use practice mode, where you see the correct answer right after giving yours. The quiz format is closer to the conditions of the real test. Plus, analyzing errors is easier and better in a batch.
In addition, while I was writing this text, Crunchprep was recommended to me; reportedly it is also convenient to use and shows what you need to improve.
5.3 Verbal
GRE Verbal is primarily about vocabulary and secondly about understanding how the most common traps in reading questions work. To do reasonably well on Verbal, it is enough to thoughtfully watch all the Magoosh verbal videos (there are fewer of them than for math) and to work constantly on your vocabulary. For the latter, quizlet.com helped me a lot (there is also memrise.com): you make word lists and then start a training session in which the site walks you through learning them. I made a habit of adding every unfamiliar word I met in Magoosh questions and in texts I read. I recorded words in sets of 50 and, toward the end of my preparation, tried to work through one set every 2-3 days. For reading, in my opinion, it is enough to solve all the related questions on Magoosh. The most important trick I took away is to first read the question, then formulate your own answer, and only then look at the answer options.
5.4 AWA
I messed up this part a bit: I got 3.0 out of 6.0, which is pretty bad. The ideal preparation, as I understand in hindsight, is to find people who can give feedback on your writing and to write 3-4 essays a week. My main problem with AWA was that it was hard to write meaningful things under tight time pressure. Magoosh offers a good template: intro, 3-4 paragraphs with theses, conclusion. It was useful to me because it lets you stop thinking about structure and focus on content.
While I was writing this text, I was also pointed to this resource, which gives a rough score for an essay in semi-automatic mode.
5.5 The skill of taking the test itself
To do well on the GRE, in my opinion, it is very important to reduce your stress level while taking it. For example, be familiar with the test interface. It is also very important to manage time properly: do not get stuck on difficult questions, and come back to them in the time remaining. For this, I recommend taking as many mock tests as possible (Magoosh has this option, and you can find a list of free tests here). In addition, the GRE offers two PowerPrep tests when you book a test slot. You should definitely take them to get a feel for the interface.
Personally, in the last 10 days of preparation I took six tests: two PowerPrep and four Magoosh. It helped me a lot on the test itself. For example, in the quant section I got a very trickily worded probability question that stumped me. But since I had test-taking experience, I skipped it, calmly returned to it at the end, and it turned out that the question was simple, just worded with a trick.
5.6 Booking a time
The latest time to take the GRE and TOEFL is the first week of November if you plan only one attempt. If you want more, add a month for each additional GRE attempt. October / November is the busiest time for test-taking, so it is best to book at least a month in advance, or even earlier, to get a slot at a convenient time of day.
For example, I am a night owl and originally booked the test for 8 am, since I booked at the last moment. I then had to watch for a convenient slot and pay $50 to move the test to four in the afternoon. In hindsight, I think it was the right decision, because I took the simpler TOEFL at 8 in the morning and felt that my brain was not fully engaged. If you are an early bird, it may be exactly the opposite for you.
5.7 Retaking GRE / TOEFL
If you are not confident in your abilities, schedule the tests so that you have time for one or two retakes. You can take the GRE five times a year with a minimum interval of 21 days, and you can retake the TOEFL as many times as you like with an interval of 12 days. In practice, this means it is better to allow a month for each GRE retake and two weeks for each TOEFL retake.
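To see how retakes push back the date of your first attempt, you can count backwards from the last acceptable test date. The sketch below is an illustration only: the function name is hypothetical, the buffers are the "one month" and "two weeks" suggested above (not the official 21-day and 12-day minimums), and the example date of 7 November 2017 is just my stand-in for "the first week of November".

```python
from datetime import date, timedelta

# Buffers recommended in the text; the official minimum intervals are shorter.
GRE_RETAKE_BUFFER_DAYS = 30    # "a month" per GRE retake
TOEFL_RETAKE_BUFFER_DAYS = 14  # "two weeks" per TOEFL retake


def latest_first_attempt(last_test_date, retakes, buffer_days):
    """Latest date for the first attempt that still leaves room for `retakes` retakes."""
    return last_test_date - timedelta(days=retakes * buffer_days)


if __name__ == "__main__":
    last_slot = date(2017, 11, 7)  # hypothetical last acceptable test date
    print(latest_first_attempt(last_slot, 1, GRE_RETAKE_BUFFER_DAYS))
    print(latest_first_attempt(last_slot, 1, TOEFL_RETAKE_BUFFER_DAYS))
```

Remember to add the booking lead time on top of this, since the busiest slots in October and November fill up early.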
6. TOEFL Preparation
The TOEFL consists of four parts: speaking, writing, listening, reading. For each you can get a maximum of 30 points. As a rule, universities require your total score to be above a certain threshold, most often 80 or 100. Some universities specify section minimums. For example, I did not apply to Cornell because they had a cutoff of 22 for speaking (I got 20). In general, if a university has a separate section cutoff, it is usually for speaking, so this part deserves special attention (see below).

If you prepared properly for GRE Verbal and AWA, then you are also ready for reading / writing, because they are simplified versions of their GRE counterparts. Listening should not be a problem either if you can watch TV shows / movies without subtitles and understand most of what is going on. If not, watching them is a good way to prepare. The main difficulty with listening during the test is that several people take the test in the same room, so you may be in the listening section while someone else is recording their speaking answers. You need to be mentally prepared for this and not get rattled.
The hardest part for me was speaking. I thought I was ready for it by default, but the test has an important catch: a time limit. You have 45-60 seconds, sometimes even less, to answer the question clearly. This requires some practice. Magoosh has a TOEFL preparation service ($50 per month). I bought it but in fact hardly used it. If I were preparing for the test now, I would certainly work through a few dozen speaking questions.
7. University grades
There are two important components: undergraduate (bachelor's / specialist degree) and graduate (master's). How grades are evaluated varies from institution to institution. Some look at your entire undergraduate record; others look only at the last two years, including the master's (if you did one). I was in a rather bad position: I had very bad grades, even though I graduated from a very good university in a very good program.
Depending on the university and the program, high grades will increase your chance of passing the initial screening, but they most likely will not affect the final decision. Bad grades reduce the likelihood that you pass the pre-filters and make your profile a bit less competitive, since many of your competitors will have a near-perfect GPA. At the same time, judging by what I have read, there is not much difference between a GPA of 3.8 and 4.0. My feeling is that if the other parts of your application are strong, a GPA > 3.5 is quite fine.

Here I took the path of damage minimization: if you have a good reason why your grades were bad, it is worth mentioning in the statement of purpose, but without overdoing it and in a positive way. In addition, if you have academic referees who taught you, you can ask them to write something like “his undergraduate grades are bad, but that means nothing.” Whether it works depends on the university and the program, but it is not something you can greatly influence, so you should not stress over it (although I still did).
If you have bad grades, it is doubly important to do well on the GRE and to be very deliberate in choosing where to apply. For example, I did not apply to MIT because they are known to care a lot about GPA. And the same MIT states outright that the GRE means nothing to them. You can probably get into MIT with bad grades, but the probability is not very high, and my goal was to maximize the probability of getting into a PhD program, provided that I liked the university and the potential advisors. More on this in the section on choosing a university and a potential supervisor.
8. Recommendations
For most universities you will need 2-3 recommendations from teachers / research advisors / people who know you academically or professionally. This raises two problems: how to find such people and what they should write.
8.1 How to choose a referee
Since you are applying for a research position, recommendations should ideally come from researchers in your area of interest who can speak to your ability to work as an independent researcher. I would aim for at least two recommendations from academia. The status of the recommender also matters: if they are well known, their recommendation is more likely to be heeded.

Since most of us cannot get a recommendation from Bengio, Hinton or LeCun, here are several possible sources of recommendations. First, your thesis supervisor is a practically mandatory option, especially if you did a master's. Second, someone from the dean's office who knows you well and thinks well of you. Third, if you have done interesting research projects or summer internships, the project / internship supervisor will do. Fourth, your immediate manager at work, if you have worked somewhere for a long time and are proud of what you did there.
The general principle when choosing a referee is that a good recommendation from a less prominent person who thinks well of you is better than a faceless one from a prominent person. The ideal is both at once, but in that case you most likely do not need this guide.
8.2 How to write recommendations
There is a chance that the recommender will ask you to draft the recommendation yourself so that they can then edit it. This is a strange experience, because on the one hand you want to write well of yourself and on the other hand objectively. Since I did not write recommendations for myself, I can give only some general advice.
Avoid recommendations of the “did well in class” type. Universities receive such recommendations by the thousands, and they are useless. If the recommendation is written by someone who taught you a course, have them write in more detail about what exactly you are good at, what interesting project you did, and how you rank among the students they have taught.
The recommendation should demonstrate how independent and capable of research you are. Professors are usually terribly busy people, so they value those who need less of their precious time. If the recommendation shows that you can do research on your own (but do not disappear!), that is a good sign. Ideally, there should be concrete examples of projects and what you did in them.
The recommendation should make it at least roughly clear how pleasant you are to work with. If you are a unique genius, this part may not matter much, but if not, then all other things being equal, preference will go to the person who is pleasant to work with. You do not need to write a lot about it, but if it is clear from the recommendation that you are a nice and decent person, it certainly will not hurt.
8.3 How to make life easier for referees
The more prominent your recommenders are, the less free time they have.
Your task is to make writing and submitting recommendations as painless as possible for them. Personally, I made them a table in Google Sheets listing all the universities I was applying to, their deadlines, and the recommendation status (request sent / not sent, recommendation received / not received, and whether a recommendation from this person is needed there at all). It also does not hurt to remind recommenders, as the earliest deadline approaches, that it is X weeks / days away.
9. Articles
This is a very important part of the application, because writing papers is the main part of scientific work. Even if you do not plan to stay in science after the PhD, you will have to write papers during your studies and defend a dissertation. It is good to try it in advance to understand what you are getting into and how interesting it is to you.
If you are a bachelor's or master's student, everything is relatively simple: look for people and laboratories at your university who do what interests you and try to do research projects with them. Do not hesitate to contact professors; everyone needs hard-working students and extra hands (especially free ones). You should not set the bar too high and aim straight for NIPS or ICLR, but it would be good to get into English-language conferences or workshops. Even if your paper was not accepted but you like it, post it on arXiv; that is better than nothing. Nobody expects NIPS papers from you: that is very hard, and one of the goals of graduate school is to learn to write such papers.
If you have worked in industry and want to enter a PhD directly, without a master's, it is harder. Here I can offer only one recipe: get a research assistant position in a laboratory and take part in several projects.
I was lucky: I got a job in a neuroscience laboratory in the USA and wrote several of my papers there. If you think my story is unique, here is an example of a person who worked in the United States as a very highly paid lawyer and then entered NYU after working for a year in a research lab. The moral: even if you have achieved a lot in your previous profession / industry, you will most likely have to sacrifice time / money for a PhD.
There is another problem, though: there are not many groups and universities in Russia doing world-class ML research. In my opinion there are four: HSE, MIPT, Moscow State University and Skoltech. I do not want to recommend specific names, but it is quite easy to find people at these universities who publish at international conferences. How to get into such a group is a separate question, and here, unfortunately, I cannot advise anything.
Finally, another way to gain research experience is to reproduce a well-known paper from scratch. This will show you how well you can do what the authors did. Moreover, ICLR has a Reproducibility Challenge in which the organizers call for reproducing papers from the previous year. This is also a good way to show that you can do research in this area and to get a quasi-publication for your PhD application.
10. The choice of university and supervisor
10.1 General
In theory, this section should come right after “Why do I need a PhD?”, but it sits near the end for one simple reason. The United States and Canada have a huge number of good universities and even more good professors. Going through them thoughtfully takes a lot of time. The items above (GRE, papers, TOEFL, GPA) constrain your choice of universities. For example, if your grades are so-so, then universities like MIT are most likely closed to you. Or your GRE may not reach the officially stated threshold (some universities state one). This means that if you postpone the choice of universities until the end, you can save time by using your results as additional filters.
In my opinion, before starting to prepare for a PhD it is worth choosing a few dream schools, places where you want to go regardless of your chances, just to try. After you take the tests, you can add a few more realistic candidates to this list based on your results.
It is also important to understand that the USA and Canada have many good universities, of which you most likely know only the 5-10 most famous (for example, Stanford, Berkeley, Harvard, Yale, Carnegie Mellon, MIT and Caltech). It is very hard to get into these universities, because everyone knows them and a huge number of people apply there every year. Personally, I aimed at getting into a university from the top 50.
10.2 Search for a supervisor
For myself, I decided that school ranking is not very important to me: there are many rankings (QS, Times, US News and so on), they can differ, and it is often unclear how they are compiled. So first of all I looked for professors who do research that interests me and who seem like pleasant people. The last part should not be underestimated: you will spend several years with your supervisor, and if you dislike them from the very beginning, it is unlikely to be a pleasant time.
To search for researchers, I used CSrankings.org, a convenient, minimalistic site where you can select various CS / AI / ML areas and see universities sorted by the number of publications at the leading conferences in those areas. Even more valuable, for each university there is a breakdown by professor. I simply chose the areas I was interested in, set the period to the last five years, and went through the list of people at each university. As a rule, I filtered out professors with fewer than 10 publications, because I was looking for people who are actively working.
For each professor, I evaluated three things.
The first is the Google Scholar profile. There I looked not only at the most cited papers but also at the breadth of the professor's interests and at their latest papers. I tried to avoid specialists who are too narrow or too broad, as well as pure theorists (there are quite a few) and purely applied researchers (there are few, because applied papers are harder to publish). I was looking for people who are strong in fundamentals and use that knowledge to solve applied problems. This eliminated about half of the professors (very subjectively).
The second is the personal site. This is the best (albeit rather imperfect) available approximation of a professor's personality if you do not know them. In my observation, good professors have a site that is not overloaded with regalia or showing off; it clearly states what the person is working on now, highlights key publications, and ideally has notes for prospective students. In addition, the site often says whether they are taking students or not. Things that put me on guard: an abundance of showing off and / or regalia (you are a professor, it is clear that you are accomplished), a lack of updates, and no students or very few of them.
The third is social networks. This is optional, but an active Twitter / Facebook account is a big plus for a professor. From it you can understand how they think, what interests them, and what kind of person they are. There are not many such professors, but I think there will be more over the years, so this advice will become all the more relevant.
It is important to understand that my method of choosing a researcher is heavily biased toward the top people. If a professor actively publishes at the best conferences, chances are they work at a good university, which is harder to get into. On the other hand, if you do not like a potential supervisor even on paper, there is a chance you will have a hard time with them.
10.3 Choice of university
Since we live in a non-ideal world, the ideal researcher may turn out to be at a non-ideal university. The problem may be the location, the admission criteria, or simply that the researcher is not taking students this year. Therefore, after filtering researchers, I filtered universities. The criteria were as follows.
The number of potential advisors. I did not apply to universities where I could not find at least three potential advisors I liked. This is a matter of maximizing the return on invested resources: you pay for each application, so it is risky to rely on a single supervisor. Plus, many universities ask you to list three potential advisors.
Match between the selection criteria and my profile. For example, my TOEFL speaking score was not very high (20), and by this criterion Cornell was closed to me. Other universities, such as MIT, look at the GPA very meticulously. Still others have a GRE cutoff, explicit or implicit. An explicit one is clear; an implicit one usually shows up as the university publishing the scores of admitted students for different years (Duke University, for example). If your scores are significantly lower, think it over.
Funding options. Most universities describe how they fund their PhD students. Usually it is a teaching / research assistantship. If this is not clearly stated on the university's website, it can be a warning sign, because there is a chance you will have funding problems. That is, they may admit you but without funding, which for me personally was equivalent to a rejection, because graduate school in the USA, like all education there, is very expensive.
How many universities you apply to depends on the time and money you have, as well as on the relative strength of your application. If you think your application is strong for the universities you are targeting, you can apply to a small number (<7); if it is relatively weak, you may want to cast a wider net. Keep in mind that your estimate of your profile's relative strength may be inflated, so it is worth playing it safe.
I know several people who applied either at the same time as me or a year earlier. The first, from the USA, with a very strong profile, applied to ~10 top universities, was admitted to more than half, and is now at Stanford. The second, from Russia, with not very good undergraduate grades, applied to six programs at five universities and was admitted to two universities, one of them in the US News top 10. The third, from China, applied to ~20 places, was admitted to one or two universities, and eventually went to a top-25 university. All of them applied to biomedical engineering.
Personally, I applied to 11 Computer Science programs (8 in the US, 2 in Canada, 1 in Europe), nine of which required an application fee. In my opinion, more is overkill. Each university requires its own application (and the forms usually differ), so expect about two hours just to fill out one application (registering on the site, filling in numerous fields, checking information); the total time grows linearly with the number of universities.
11. Writing statement of purpose / personal history
A Statement of Purpose (SoP) is a two-page text about who you are, why you need a PhD, what you want to do, and what relevant experience you have. This description alone makes it clear that the main problem of the SoP is squeezing a huge amount of information into a very small amount of text. Depending on your profile, aspirations, and character, you will have to sacrifice some parts and write more about others.
Estimates of the role of the SoP vary greatly. Some guides say that it is the most important part of a PhD application; others say that it is a more or less formal part (after all, someone else can write it for the candidate). In my opinion, the role of the SoP grows if your profile is not ideal and you are not a bachelor / specialist student at the time of submission. Personally, I spent a lot of time writing it and formulated several important principles for myself, which are listed below. Important note: once again, this guide is very individual, and this part is doubly so. There is a chance that you will arrive at a scheme of your own.
Rewrite the SoP over and over. I am still ashamed of the SoP that I wrote in a hurry for a well-known European laboratory. It was smug, stupid, and overloaded with unnecessary details. Be prepared to write several drafts in order to throw out everything unnecessary and state the important things vividly and briefly.
Understand what you want to do, at least in general terms. I formulated a scientific question that interests me even before I decided to apply for a PhD. It helped me both in searching for supervisors and in writing the SoP. If you find it difficult to formulate a scientific question even in general terms, then in my opinion this is an alarming sign (see the “why do I need phd” item). On the other hand, be prepared that you will not end up doing the things described in your SoP. Such is this silly duality.
Show the SoP to anyone you trust. It does not matter whether the person works in science or industry; the main thing is that their opinion matters to you. Your task is to gauge the range of reactions to your text: do you come across as ingratiating or complacent? Is it clear what kind of person you are? Is it clear what you want? You want to reduce the likelihood of an extreme reaction to the text, because it will be read by many different people. For example, a couple of very good and painfully accurate pieces of advice came from a friend of mine who has no higher education but understands people very well. Another couple of good pieces of advice came from a man who has read a lot of cover letters.
Write to the point. The SoP is not an exhibition of achievements or a CV, but a text about why you need a PhD and what exactly you want to do. All your achievements should be presented as evidence that you can get through a PhD and do research. Everything else is better described in the CV or the statement of personal history. Quality matters, not quantity, so it is better to choose two or three standout achievements and describe them well than to produce a text version of your resume. Feedback from people will help you with this (see the paragraph above).
Show what kind of person you are. When I first looked at my friend's SoP, it seemed bad to me, because I thought it lacked focus and a sense of purpose. After writing my own SoP, I realized that his text had an important merit: it made clear that my friend was a good person. Universities are looking not only for stars, but also for people they will enjoy working with. It is hard to show who you are in a short text, but if you succeed, you increase the chance of finding a supervisor who is close to you in spirit (or of filtering out those who do not suit you).
Proofread for errors. This advice looks obvious until you realize that you made a typo in the name of a potential supervisor at your dream school, to which you have already applied. It happened to me, and it was very unpleasant. Do not repeat my mistake.
12. CV and cherries on the cake
In this section I will describe what, in my opinion, can be useful for your resume. None of the things below guarantees anything on its own (although a cool GitHub can help a lot), but each can make your profile a little stronger and help you stand out from the other candidates.
12.1 Live GitHub
Most likely, you already have an account. If not, you urgently need to create one and learn how to use it, because it is a daily work tool at many universities. More likely, you have a GitHub account, but there is not much of interest in it. How do you fill it? The best option is to reproduce well-known ML / DL papers in a well-known framework such as TF, PyTorch, or Keras. I did not do this, but I have repeatedly seen this advice from cool people like Bengio, so do not repeat my mistake. Keep in mind that you are unlikely to bring a GitHub profile to life in a couple of months, so start working on it as early as possible. If you have a scientific article and can publish its code, do it, because it is the best demonstration of how you write code. Another option is decent code from ML competitions, even if you did not win any prizes.

12.2 ML Competition Experience
If you are interested in ML, you have most likely participated in Kaggle competitions. For me, this is a great way to get out of your comfort zone and try new tasks. Keep in mind that Kaggle requires a lot of time and mental toughness. As a rule, all the obvious things have already been done by others or described in public kernels, so you constantly have to invent something new. That is why it is so useful. A useful habit (which I never developed) is to clean up the code after the competition and post the documented solution on GitHub.
If you are in Moscow, there is a cool ML training group at Yandex, where people who have recently won gold or high silver medals regularly give talks. They also have a YouTube channel with recordings of the talks.
The downsides of Kaggle: it takes a lot of time, requires a lot of computing resources, and some of the skills needed to win are very specific. But in my opinion, the pros outweigh the cons, especially if you try to generalize your experience rather than just repackage public kernels (which I myself did a little more often than necessary).
12.3 Personal site
Before applying, it would be good to have a small website that describes who you are, your projects, and your aspirations. Most university application forms let you include a link to your site. I built my site after submitting my PhD applications, so they contained no links to it. The most I could do at that stage was add a link to the site on LinkedIn, GitHub, and Google Scholar. The main reason I did not make the site right away was that I had chosen an overly complex engine that I did not fully understand. As soon as I found a simpler, more minimalistic engine, I made the site in a couple of days. Again, do not repeat my mistake: make the site in advance.
12.4 Google Scholar
If you have articles, then you need Google Scholar. It's that simple.
12.5 Coursera
Having Coursera courses on your resume is better than not having them, but judging by the resumes of admitted students that I have seen, they are quite common.
13. Application process
Be prepared for the fact that filling out applications is a time-consuming procedure that will take you 2-3 hours per university at best. Each university has its own application system, and these can differ greatly from one another. For example, universities have different requirements for uploading documents: one requires a single PDF of no more than 2 MB, another requires uploading individual transcript pages as separate files, and a third asks you to enter key courses by hand. Or a university may require two registrations at once: one on the university website and another on the website of the department you are applying to. In addition, each university has moving parts that you need to track: whether your GRE and TOEFL results and your recommendations have arrived. And if you write a separate statement of purpose for each university, it is good to keep links to all of them in one place.

I organized the application process through a Google Doc, where I kept all the necessary information: university name, application deadline, potential supervisors, link to the login page, the login itself, GRE and TOEFL status, status of all three referees, university response status (useful while waiting), link to the statement of purpose, and so on. In addition, I had a simplified table for the referees, so that they could easily keep track of when and where to send recommendations. This system worked very well for me.
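For those who prefer a script to a spreadsheet, the same kind of tracking table could be sketched in Python. This is only a minimal illustration, not the table I actually used: the class `Application`, its field names, and the helper `needs_attention` are all made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Application:
    """One row of the tracking table (all field names are illustrative)."""
    university: str
    deadline: date
    supervisors: list[str]
    login_url: str = ""
    gre_received: bool = False
    toefl_received: bool = False
    # One flag per referee: has the recommendation letter arrived?
    letters_received: list[bool] = field(default_factory=lambda: [False] * 3)
    response: str = "waiting"  # e.g. "waiting", "interview", "offer", "rejected"

    def missing_items(self) -> list[str]:
        """Return what the university has not yet received."""
        missing = []
        if not self.gre_received:
            missing.append("GRE")
        if not self.toefl_received:
            missing.append("TOEFL")
        for i, received in enumerate(self.letters_received, start=1):
            if not received:
                missing.append(f"letter {i}")
        return missing


def needs_attention(apps: list[Application], today: date, days: int = 14) -> list[Application]:
    """Applications due within `days` that still have missing items, soonest first."""
    urgent = [
        a for a in apps
        if 0 <= (a.deadline - today).days <= days and a.missing_items()
    ]
    return sorted(urgent, key=lambda a: a.deadline)
```

Calling `needs_attention(apps, date.today())` once a day would list the applications that are close to their deadline and still waiting for scores or letters, which is the check I otherwise did by eye in the table.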
14. Waiting and response timelines
So, you have sent all the applications and made sure that the recommendation requests reached the referees and that they submitted their letters on time. Time to relax, right? If you are one of those people who answered "yes", you can skip this section. If, like me, you find waiting and uncertainty hard, read on.
In the United States, universities generally send their decisions on applications by April 15. The problem is that the distribution of response dates varies greatly from university to university. Some start sending answers in late January or February, others not before March or April. Guessing is very difficult, so I took a different approach.
There is a site, thegradcafe.com, where applicants themselves post their applications and their status. I parsed these posts for the past five years and plotted decision timelines, both averaged over all universities and for the universities I was interested in. The timeline for all universities looks like this (link to a larger image):

Timelines for specific universities can be found in this album. They show that in most cases there is no need to worry much if you have no answers before the beginning of March. If there are still no answers by mid-March, this most likely means that if you are on any short lists, you are near the end of them. At the same time, keep in mind that an offer can still arrive at the beginning of April (as happened with one of mine).
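The aggregation behind such timelines can be sketched roughly as follows. This is not my original script: the record layout `(university, decision, decision_date)` and both function names are assumptions made for the illustration, and collecting the posts from the site is left out entirely.

```python
from collections import Counter
from datetime import date


def weekly_decision_counts(records, university=None):
    """Count reported decisions per ISO week number, pooled across years.

    `records` is an iterable of (university, decision, decision_date) tuples;
    this layout is an assumption, not the actual format of the site's data.
    """
    counts = Counter()
    for uni, _decision, decided_on in records:
        if university is None or uni == university:
            counts[decided_on.isocalendar()[1]] += 1
    return dict(sorted(counts.items()))


def cumulative_share(weekly_counts):
    """Turn weekly counts into the share of all decisions reported by each week."""
    total = sum(weekly_counts.values())
    running, shares = 0, {}
    for week, n in weekly_counts.items():
        running += n
        shares[week] = running / total
    return shares


# Placeholder records, not real data.
records = [
    ("University A", "accepted", date(2017, 2, 1)),
    ("University A", "rejected", date(2017, 2, 3)),
    ("University B", "accepted", date(2016, 3, 15)),
    ("University A", "rejected", date(2018, 3, 14)),
]
print(cumulative_share(weekly_decision_counts(records)))
```

Pooling by week number across years is a simplification that works because almost all decisions fall between January and April. The cumulative share answers the practical question directly: by a given week, what fraction of the decisions from a university has usually already been reported.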
15. How to deal with failures

Rejections are very unpleasant and hurtful. And the more you want to get into a particular university, the more its rejection hurts. I used several mental techniques to reduce the pain.
I reminded myself that universities physically cannot take all the good students. Machine learning is now an insanely competitive field, so a university gets thousands of applications for a few dozen PhD positions, and most of those applications are strong in one way or another. Under these conditions, you can be a strong candidate and still not get in, simply because you were not the one chosen. Two of the rejections I received mentioned the number of people who had applied for the CS PhD at that university: about 1,500 in one case and about 2,000 in the other. This means that your chances are very low, even if you are a strong candidate.
I started a tradition of the rejection latte. Every time I was rejected, I went to a nearby coffee shop, bought myself a big tasty latte, and slowly drank it. It worked surprisingly well: even a rejection brought a small reward. In total, I received nine rejections, which came to about five liters of coffee.
I whined to a couple of close people. In such a stressful situation, it is important to have someone to talk to. This does not mean that you should pour out your sorrows to these people day and night, but it is good to have someone who will listen and joke along with you.
16. My results and a few words about a visa to the USA
As I wrote above, I applied to 11 universities (10 in the USA and Canada, one in Europe). About half of them were very popular and famous, such as Berkeley, UT Austin, or NYU. The other half were simply good universities that, in my opinion, were below the radar. As a result, I was accepted by two good below-the-radar universities (one of which is in the top 10 by CS Rankings), which I consider a success.

The process of obtaining a student visa in Moscow took me 2.5 months, and I had to postpone the start of my studies to the spring semester. This is further evidence of how important the choice of country is when choosing a university: all your efforts may be complicated by political tensions between countries or by the immigration policy of the country where you were admitted to a PhD.
Useful links
A very sobering 2004 article on how, after immigration policies were tightened under Bush Jr., a lot of PhD students and professors in the United States had problems with visas
17. Conclusion and thanks
Regardless of the outcome of the situation with my visa, I think it was a rewarding experience. At least because I wrote this guide, which will help you to avoid my mistakes (and make your own). If you do decide to enroll in a PhD, then good luck with that!
Many thanks to Pavel Nesterov (mephistopheies), Ekaterina Arkhangelskaya, Gleb Posobin, Maxim Artemyev, Yulia Denisova and Anvar Kurmukov for their comments and help in writing this guide.