
From Physicist to Data Scientist (from engines of science to office plankton)

Introduction


Not long ago, twelve months ago to be exact, I started my last year of graduate school in the physics department at the University of California, Davis. A fair question came up: what next? Between teaching, pushing science forward and other entertainment, the year would fly by, so I had to decide in advance. The main plan was to find a postdoc position somewhere like Tokyo, Rio de Janeiro or Singapore, so that I would get to travel and work at the same time. In theory I had everything ready for it: papers, contacts, and enough knowledge of certain areas of condensed matter physics for three people. I started actively googling the sites of universities in geographically interesting parts of the world, wrote an academic CV, subscribed to mailing lists that publish postdoc openings, and hinted to all my acquaintances that if anything came up, they should tell me first. I even talked on Skype with a few professors about working in their research groups. In general, things were rolling along.

Around the same time, an acquaintance of mine dropped by our town. He had graduated from our glorious department a couple of years before me, had spent those years bouncing between various companies, and had finally landed a job with the title Data Scientist. We sat in a bar and chatted. What he actually did failed to hook me (when you spend every day figuring out what to quantize, and where, in order to describe the properties of nanomaterials, stories about something being aggregated in some database, and why that matters for sales of office supplies, do not grab you at all), but the salary did. For reference, gross pay in the US, that is, before taxes:

  1. Graduate Student - $27k
  2. Postdoc - $45k
  3. Professor - $117k


And my acquaintance, with no formal training in the field and no work experience, was hired straight away at $100k. In other words, he leapfrogged the whole sorry crowd of postdocs. (As it turned out later, he had also bargained badly: he should have asked for $130-150k, which would have put him ahead of the professors too.)

But happiness is not in money, and not even in the amount of it. Money is, after all, a tool and nothing more.

My acquaintance left, and I sank back into the depths of academia. I was lecturing that quarter, which meant preparing lectures, writing quizzes and answering student emails. I was swamped. But once the quarter ended, I started thinking again about where to go after graduation.

What has always bothered me about academia is how stagnant it is and how little dynamism there is. Everyone sits in their comfort zone and refuses to leave it. Boring. Judging by the movies about Silicon Valley, though, everything there is dynamic and youthful. But I did not want to become a programmer: first, it is not interesting, and second, nobody would hire me. All my programming-adjacent knowledge is self-taught, with no formal education in the area. Right then another acquaintance turned up, who had graduated very recently and had also landed a Data Scientist position. Whether he was simply better at spinning a tale or his job really was more interesting, this time I was hooked.

I started googling: nothing was clear. Data Science is mentioned in completely different contexts, and the requirements for a Data Scientist position differ radically from one job posting to the next. There are plenty of pretty words about big data and artificial intelligence, but they do not add up to a big picture.

I had to start somewhere, and my first step was to enroll in the Data Science specialization on Coursera. The specialization has 9 courses, each a month long, so in theory it stretches over 9 months, but I did not have that much time. It was January, and I was going to graduate in June. That left very little time to learn a completely new field and find a job as well. So I took those 9 courses three at a time. It was hard at times, but overall it is doable.

What I took away from the specialization: Data Science is a murky business, in the sense that everyone tries to stretch these two words over anything that has something to do with data. But it did become clear that a universal Data Scientist should know statistics, understand machine learning, and be able to write code in R, Python, Java and Scala.

It was March. A certain structure had formed in my head, but since the specialization is very basic, and the lecturers, judged by the quality of their teaching and the overall organization of the courses, honestly deserve a C, I did not take much away from it. But! One of the courses mentioned a site where you can practice your machine learning skills, namely kaggle.com. In my case, learning that this site exists turned out to be a great help later on. I poked around and failed miserably at a couple of competitions, but then I got drawn in, and for many months afterwards, despite a chronic lack of time, I took part in every competition.

In parallel, I wrote the first version of my resume, tried to improve my LinkedIn profile, and even got a couple of interviews. But on the whole time was passing and I was not looking for work very actively. Before I knew it, it was June, I had defended my thesis, and the job search could no longer be postponed. That is when I rolled up my sleeves and got to work.

Main part


Everything up to here was storytelling. From now on I will try to write in a more structured way, because looking for a job is a serious matter.

The interview process for a Data Scientist position in the USA consists of the following stages:
  1. Your resume reaches a recruiter.
  2. If the recruiter likes the resume, you move on to a phone conversation with the recruiter.
  3. If that conversation goes well, you move on to a conversation with a member of the Data Science team.
  4. If that conversation goes well, you move on to a technical phone interview, usually in a shared Google Docs document, collabedit or a similar tool.
  5. As a rule, if there are no doubts about your technical skills, you are invited to an on-site interview, where many different people interview you for many hours, with a break for lunch.
  6. If the previous stage goes well, you are offered a job and start settling the details (negotiation).


That is the standard set, but how to hire the right Data Scientist is a serious question, so each company has its own approach. For example, after the technical interview you may be given a take-home data set to analyze and present at the on-site interview (examples: Pivotal, Bidgely, Uptake), or you may get this task before the technical interview (example: Capital One). You may be asked to solve puzzles on HackerRank (Capital One again). Or they may skip the technical interview and invite you straight to the on-site (example: Affirm).

The gap between one step and the next can be anything from one day to several weeks, so you must start early! With large companies like LinkedIn or Google, you can safely apply 9 months before graduation. (This was one of my serious miscalculations: I never expected a job search to take this long.)

Each step in this process requires different skills. So, one by one.

Resume / LinkedIn Profile


First of all, you have to look good on paper, which means your LinkedIn profile and your resume. (Anyone interested is welcome to visit my LinkedIn profile and copy whatever they like. It worked for me, so maybe it will help you too.)

A common mistake people make when writing a resume or filling in LinkedIn is to put in what they are proud of rather than what actually belongs there. For example, in academia the measure of your personal coolness is your papers (the order of the authors matters, and being first author is terribly fashionable), conference talks and other achievements that nobody outside your closed little world cares about. They are dear to you, you sweated over them for many years, but you should not write about them. At most, mention them in passing.

A resume should contain what sells for the specific job opening. In theory it is better to write a separate resume for each opening, but that is very tedious. The main job of a resume is to get the recruiter to contact you and schedule a phone call.

My entire work experience consists of industrial climbing, teaching and research at the university, and army service. That does not sell.

Scientific publications and conference talks on topics not directly related to Data Science do not sell.

Education sells, but poorly. I have the strong impression that in the San Francisco Bay Area nobody will even look at your resume unless you have work experience, a PhD in something, or at least a master's degree in Computer Science. On top of that, fresh graduates (Fresh Grads) are divided into first-class people (graduates of Stanford and UC Berkeley) and everyone else. It is common and entirely expected not to get a phone screening just because you do not have a PhD, and, even if you have one, not to get a phone screening because you are not from Stanford. (There are quite a few startups with a hard rule: recruit only from top schools. I do not know about large companies, but I think they approach the process more sensibly and suffer less from this kind of nonsense.) In short, education goes into the resume, but without extra detail (name of the university, specialization, period of study).

It helps a lot if what you did at university is related to data analysis, especially if the recruiter can grasp, at least at the level of the idea, how that knowledge could be applied at the company. (You can stretch the truth here, but not by much.)

A few lines of my resume are devoted to the results of the machine learning competitions mentioned above. I sank a lot of time into Kaggle, so it fit into the resume nicely.

An important but non-obvious part of the resume is Communication and Leadership. The idea is that academia warps your personality, in the sense that "nerds" are hard to communicate with and, on top of that, often do not know how to work in a team. Here my teaching came in handy, at least as a resume line that tries to say I can explain technically difficult topics to people who know little about them.

And there was still a lot of empty space. There I listed the titles of the online courses on Data Science-related topics that I had taken on Coursera and edX, in a subsection called Independent Coursework. Americans love the word Independent, and Coursework sounds good.

That is really all. In essence: a degree, Kaggle and a lot of padding. But so be it. The job of a resume is to get a phone screening.

It turned out like this.

How do you get your resume to a recruiter?
  1. LinkedIn: a list of jobs posted on LinkedIn itself, plus jobs that LinkedIn has pulled from other resources. The downside is that the list is easy to access and so there are many applicants; 300 to 1000 applicants per opening is normal. The upside is that there are many openings, so you can apply in bulk to everything you can.
  2. dice.com: there are some openings there too, but I did not get a single interview through it.
  3. monster.com: there are also some openings, but I registered there rather late.
  4. The jobs section at kaggle.com.
  5. Friends who work somewhere can recommend you. (This got me interviews with Google and Pebble.)
  6. Friends who received a job offer and turned it down, but recommended you instead. (This got me interviews with Uptake and Bidgely.)
  7. The Career Fair at UC Davis was useless, and you cannot get into the Career Fair at Stanford or Berkeley without a student ID. But I realized this too late. Had my brain switched on earlier, maybe I would have come up with something.
  8. Meetups: in the Bay Area, meetings on Data Science topics are held almost every day. At the very least you can meet someone there, and at best you can make an impression. (A recent example: I had to wait out the traffic jams, so I went to a nearby meetup on Deep Learning in Natural Language Processing. This topic had never come easily to me. Neural networks and NLP each work for me separately, but when I combined them the result was mediocre, so I went to get enlightened. I guessed wrong. Everyone there was inexperienced, so I spent two hours at the blackboard telling them what I know on the subject, and the next day a couple of the people who had been at the meetup let me know that they had an opening at work that was just right for me. But that is rather an exception. And I do not like meetups where I learn nothing.)


So you seem to have a wonderful resume, and you are sending it out, but nobody responds. One problem is that at large companies, where recruiters are experienced, there are crowds of candidates, and they come with recommendations, and you are on your own. You get lost in the crowd. On the other hand, the recruiters there are sensible. (On the good side I note Google, Pivotal and LinkedIn. I want to single out Michael Obukhov at LinkedIn: I do not know what he wrote in his report on the interview, but he asked good questions and strictly to the point.)

With startups the situation is different: recruiters there are young and inexperienced and do not really know what they want to see in a resume. For example, job postings at large companies are short but specific, while many small startups have a sea of requirements. There was one startup that wanted the following from a potential Data Scientist:

  1. Expert-level knowledge of machine learning algorithms.
  2. Expert-level knowledge of statistics.
  3. Expert-level knowledge of genetics.
  4. Ability to write production-quality code.
  5. Ability to work with all kinds of databases.
  6. Naturally, a PhD in a technical field.

Plus another long list of requirements. What is more, they were not offering candidates a job but a low-paid contract for a few months, after which you might be moved to full-time. Finding a candidate who meets these criteria and also agrees to work for food is unrealistic.

My point is that you should send your resume anywhere and everywhere, even if the job does not interest you. Every interview is interview practice, and for someone who does not know how to interview, that practice is worth its weight in gold.

Phone screening


The recruiter calls you. What does he want? He (more often she) wants to add comments to your resume.
Typical questions:
  1. Why do you want to work at our company?
  2. Have you finished your degree, and if not, when will you finish?
  3. Are you interviewing with other companies?
  4. What is your visa status? And when will your visa allow you to start work?
  5. What is your experience working with data?
  6. A bunch of questions about your resume. With your answers you have to convince her that she will not get in trouble for wasting precious time when she passes your resume, with her comments, on to whoever needs to see it.


It is all straightforward. The better your resume, the fewer silly questions you get. And when I say a resume is good, I do not mean that it presents you well; I mean that the recruiter will like how it presents you. In essence, your resume should be tailored to her or his expectations of you. Usually you pass from this step to the next without problems, although there are exceptions: once I was dropped because the company works with some classified data, and with Russian citizenship I could not be given access to it.

Talking to a member of the Data Science team


This conversation is similar to the one with the recruiter, but more technical. Nobody asks about the visa any more.

This is where things get murky. People come to Data Science from all kinds of fields: Computer Science, Statistics, Physics, Math, Economics, Biology, etc. And they usually start interviewing candidates almost right away. That is, they have no real work experience and no real interviewing experience, but they do have ideas and a desire to practice. And then they run into you ...

They want a lot of different things from you.

Typical questions:
  1. Give an example of your work with data.
  2. Here is a problem; how would you approach it?
  3. What problems would we run into if we took this data and this algorithm and tried to answer this question?


This is where Kaggle paid off. After half a year of working on various problems, I can talk on this subject for hours. Without Kaggle I would have floundered here. On the one hand the range of questions is narrow: the data and your experience. On the other hand it is boundless, because they can ask about anything in Machine Learning, Statistics or use cases, and the question need not be at a basic level. The topics vary widely too: Natural Language Processing, Credit Card Fraud Detection, Recommender Systems. And there is no guarantee that they understand the topic themselves. They often like to ask questions they do not know the answer to, ones they are struggling with at work. You are practicing being interviewed, they are practicing interviewing on you, and, as everyone knows, silly questions are easier to ask than to answer.

There was one such case. At Pebble, a guy asked me: "Geeks use our products, but how would we start promoting our watches to non-geeks?" I answered: "I can tell you without any Data Science: fire your designer. No self-respecting president would declare World War III wearing your watch, even if he wanted to. He would simply be embarrassed to appear in public with your product on his wrist." The next morning I got an email saying they had not liked me. Oh well.

What helps is to go to GlassDoor, look up the questions asked in interviews for Data Analyst, Data Scientist and Software Developer positions, and work through all of them. It is not a panacea, but in interviews you often run into puzzles you have seen somewhere before.

Knowledge alone will not get you through this step; you also have to use your head. For example, I was once asked how I would reproduce the Swype algorithm. My Kaggle competition experience helped: ideas gushed out of me like a fountain, and, as it turned out, my interviewer liked that a lot.

Again, at big companies and big startups the questions are clearer and the interviewers more sensible. On the good side I note LinkedIn, Google, Pivotal, Bidgely and Affirm; on the bad side, Pivotal, Pebble, Turn, Workday and Leap Motion. (Pivotal appears twice because I went through this stage there twice, and once I got a self-assured woman with a low IQ and we did not click.)

Technical Interview


This interview is conducted by a member of the Data Science team, usually in Google Docs or collabedit. …

Topics that come up:
  1. …
  2. Statistics
  3. Programming: Python, R, Java.
  4. …
  5. MapReduce


… Computer Science …

… GlassDoor … background …
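
MapReduce is one of the topics named in the list above. For readers who have not met it, here is a minimal single-process sketch of the map / shuffle / reduce pattern, using the classic word-count exercise. This is only an illustration of the general idea, not a question I was actually asked; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for this example and do not come from any framework.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is a plain in-memory dictionary (an assumption of the sketch).
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of text documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["data science data", "science of data"]))
```

The point of the pattern is that the map and reduce steps each look at only one document or one key at a time, which is what lets a real framework spread them across many machines.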

Onsite interview


This is a marathon of several hours at the company's office. If you are coming from another city, they pay for your flight and hotel. (I had a very nice trip to Chicago.)

A bunch of different people talk to you for half an hour each. Lunch is usually in the middle.

Usually they come one at a time, but LinkedIn does it nicely in pairs: an experienced person grills you while a second, recently hired one learns from the senior colleague, although sometimes she asks questions too.

It includes writing code on a whiteboard, "how would you attack this problem" questions, theory questions, and just chatting about life.

Onsite …

Conclusion


Finding a job in Data Science in the San Francisco Bay Area is not easy, especially if you do it at the last moment. It is a nerve-racking process that takes a lot of time and effort, largely because the process itself is long and you are interviewing with many companies in parallel. Two or three interviews a day is normal. It is stressful at first, and then you get used to it. I graduated in June and received a job offer only in October. Yes, it is a first job, and that is hard to find in any case. But every day of those months nerve cells die, and they do not grow back: not only yours, but also those of your friends, your family and everyone who cares about you.

Can you cut the corner and skip all these stages, or at least simplify them? Yes, you can. There are organizations that take talented graduates, train them and help them find a job (examples: Insight Fellowship, Data Science Incubator). But! The number of places is very limited and the number of applicants is huge, and on paper they almost certainly look better than you. Still, I know a few people who were selected by Insight, and they had no trouble finding work. So I actively recommend that all my acquaintances who have this whole job-search saga in Data Science ahead of them apply to these organizations.

Another way to cut the corner is an internship at some company. Had I been smarter, I would have tried to get into an internship every summer I spent at UC Davis. Life would have been so much easier.

The question is: was it worth it, and what has changed?

The alternative to a job in Data Science was a postdoc position. The upsides are the same familiar workflow as in graduate school and a sea of free time. The downsides: the number of postdoc openings is very limited, and finding one is much harder than finding an industry job. That is, you have almost no choice of where to live and work. The money is sad, and the prospects are unclear. Roughly 3% of those who go into a postdoc after graduate school find a professor position after 5-10 years as a postdoc, and again it is wherever the position is offered, not where they want to be. As a rule, almost all the postdocs I know (there are truly gratifying exceptions) do science and look for a job in parallel, many of them in Data Science.

The Data Scientist position: …

What changed? …

Source: https://habr.com/ru/post/295954/

