📜 ⬆️ ⬇️

Internship at JetBrains and how I almost managed to get on it

image

Like many young developers, when there is a desire to find a job / internship - I look in the direction of the coolest IT companies.

Recently, I tried to get into the ranks of JetBrains and under the cut is ready to share my experience.

Why "almost" succeeded?


Surely you immediately have such a question.
')
In my opinion, I have a good resume with a bunch of achivok and a good skill, which I have been improving over the last 8-9 years, day after day.

I completed the test task (and it seems to me well), had previously visited the JB office, which is good in my city, communicated with HH and some of the developers of the company, and as a result was refused an internship without any comments.

Most likely the reason lies in the fact that JetBrains selects only students for an internship, and at the moment I just graduated from the 11th and pass exams one by one.

Well, this is a reason for another year to train yourself and apply for the next year.

Test case analysis


The deadlines for submitting applications for internships and testing of test tasks are over, which means that everyone who solved them, including me, can lay out an analysis of these tasks so that next year any student can get acquainted with the approximate level of assignments with JB which he will have to face and in which case he will tighten up his knowledge.

I applied for an internship in the development team of the Korutin debugger for Kotlin.

The task of this team on internship for those who got it this year will be the refinement of this part of the debugger and its integration with the IDE.

The task was a bit expected - to write a debugger for a small PL.

I would not say that it is difficult, rather the opposite. It does not require any deep knowledge of the theory of building translators and a cool skill. But nevertheless, those who are applying for an internship in this area must at least have these basics and easily cope with this task. I was surprised when I decided to search on github for the keywords of the solutions of my “competitors” and found 1-2 more or less working-type solutions against about 6-7 empty repositories or with a couple of pieces of code, after which people gave up. Maybe I was looking badly, but nevertheless the results did not please me. If this post is read by people who have abandoned this task, do not do this in the future. In extreme cases, it was enough to sit on the task for a couple of days and I am sure you would have coped with it.

Task text itself
Task: implement step-by-step execution of the code for the trivial programming language Guu.

Attention: in the description below, some essential points are obviously omitted. As a rule, they remain at your discretion. If it is completely incomprehensible, write to (here the mail, which I decided to remove).

The program on Guu consists of a set of procedures. Each procedure begins with the line sub (subname) and ends with a declaration of another procedure (or the end of the file, if the procedure in the file is the last). Execution begins with sub main.

The body of the procedure is a set of instructions, each of which is on a separate line. At the beginning of the line may appear insignificant tabs or spaces. Blank lines are ignored. There are no comments on Guu.

In Guu, there are only three operators: - set (varname) (new value) - setting the new integer value of a variable. - call (subname) - procedure call. Calls can be recursive. - print (varname) - print the value of the variable on the screen.

Variables in Guu have a global scope. The program below will display the string a = 2.

sub main
set a 1
call foo
print a

sub foo
set a 2

But the simplest program with infinite recursion:

sub main
call main

You must write a step-by-step interpreter for Guu. When it is started, the debugger should stop at the line with the first instruction in sub main and wait for commands from the user. The minimum required set of debugger commands:

i - step into, the debugger comes in call (subname).
o - step over, the debugger does not go inside the call.
trace - print stack trace execution with line numbers starting from main ...
var - print the values ​​of all declared variables.

The format of the user's communication with the debugger remains at a higher discretion. You can choose both minimalistic GDB-like interface and console or graphic UI. The names of the debugger commands can be changed if desired.

To solve this problem, you can use any programming language from TIOBE TOP 50 and open-source compiler / interpreter.

In evaluating the work will be evaluated:

The overall performance of the program;
The quality of the source code and the availability of tests;
Easy extension of functionality (for example, support for new language statements or debugger instructions).
The solution with the instruction for its assembly should be published in the Git-repository (for example, on GitHub or BitBucket). In the answer you need to specify a link to the repository. The link to the private GitHub repository will do as well, only you will need to add me to it.

I am writing in C ++, Java and Object Pascal.

At first there were thoughts to write everything on my YaB (Mash), but I thought that it would not be very convenient for the JB employee to check, and I filed an application 2 days before the closing of the submission (exams still ...), and it was already evening outside the window - I decided to quickly write everything in more known languages.

To solve the Pascal problem, in my opinion, it is most suitable, at least because of the most convenient implementation of strings ...

At least for me. In addition, it is in TIOBE TOP 50, so I boldly launched an IDE, namely, Lazarus, since He is not commercial :) and started to solve the problem.

Despite the fact that JB gives as many as 7 days, in total I spent about an hour in total, and the project turned out to be about 500 lines of code.

Where to begin?


First of all, you need to imagine how the debugging of the code will eventually work.

We need to implement step-by-step code execution - each instruction should be represented as a structure / class and, in general, instructions should look like a list of these classes or, as in my implementation, refer to each other to form a sequence (later I will write down why I did so).

To get this sequence, our debugger needs to process the code in the proposed language, which means we also need to implement a small parser, as well as syntactic and semantic analysis of the code.

Let's start with the implementation of the parser. Because Guu language consists of a set of tokens, separated by a space, then it is logical first of all to write a small and simple tokenizer:

function GetToken(s: string; tokenNum: word): string; var p: word; begin s := Trim(s); s := StringReplace(s, ' ', ' ', [rfReplaceAll]); while tokenNum > 1 do begin p := Pos(' ', s); if p > 0 then Delete(s, 1, p) else begin s := ''; break; end; dec(tokenNum); end; p := Pos(' ', s); if p > 0 then Delete(s, p, Length(s)); Result := s; end; 

Further we declare enum from tokens:

 type TGuuToken = (opSub, opSet, opCall, opPrint, opUnknown); const GuuToken: array[opSub..opPrint] of string = ( 'sub', 'set', 'call', 'print' ); 

And the instruction class itself, into which we will parse the lines of code:

 type TGuuOp = class public OpType : TGuuToken; OpArgs : TStringList; OpLine : Cardinal; OpUnChangedLine: string; NextOp : TGuuOp; OpReg : Pointer; function Step(StepInto: boolean; CallBacks: TList; Trace: TStringList): TGuuOp; constructor Create(LineNum: Cardinal; Line:string); destructor Destroy; override; end; 

The instruction will be stored in OpType, and the remaining parts of the structure in OpArgs.
OpLine, OpUnChangedLine - information for the debugger.

NextOp - pointer to the next instruction. If it is equal to nil (null in Pascal), then there are no instructions and you need to complete the execution of the code, or return to the callback stack.

OpReg is a small pointer-register that will be used later for a small optimization of code execution.

After the class declaration was written - I decided that the most compact and beautiful solution would be to add a parser and a little syntax analysis in its constructor, which I further did:

 constructor TGuuOp.Create(LineNum: Cardinal; Line:string); (* * That method parse code line. *) var s: string; w: word; begin inherited Create; OpArgs := TStringList.Create; OpLine := LineNum; OpUnChangedLine := Line; NextOp := nil; OpReg := nil; s := GetToken(Line, 1); OpType := TGuuToken(AnsiIndexStr(s, GuuToken)); case OpType of opSub : begin // sub <name> s := GetToken(Line, 2); if Length(s) > 0 then OpArgs.Add(s) else begin writeln('[Syntax error]: Invalid construction "sub" at line ', OpLine, '.'); halt; end; if Length(GetToken(Line, 3)) > 0 then begin writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.'); halt; end; end; opSet : begin // set <var> <value> OpArgs.Add(GetToken(Line, 2)); OpArgs.Add(GetToken(Line, 3)); w := 1; while w < Length(OpArgs[1]) + 1 do begin if not (OpArgs[1][w] in ['0'..'9']) then begin writeln('[Syntax error]: Invalid variable assigment "', Line, '" at line ', OpLine, '.'); halt; end; inc(w); end; if (Length(OpArgs[0]) = 0) or (Length(OpArgs[1]) = 0) or (Length(GetToken(Line, 4)) > 0) then begin writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.'); halt; end end; opCall : begin // call <name> s := GetToken(Line, 2); if Length(s) > 0 then OpArgs.Add(s) else begin writeln('[Syntax error]: Invalid construction "call" at line ', OpLine, '.'); halt; end; if Length(GetToken(Line, 3)) > 0 then begin writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.'); halt; end; end; opPrint: begin // print <var> s := GetToken(Line, 2); if Length(s) > 0 then OpArgs.Add(s) else begin writeln('[Syntax error]: Invalid construction "print" at line ', OpLine, '.'); halt; end; if Length(GetToken(Line, 3)) > 0 then begin writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.'); halt; end; end; else begin writeln('[Syntax error]: Invalid token "', s, '" at line ', OpLine, '.'); halt; end; end; end; destructor TGuuOp.Destroy; begin FreeAndNil(OpArgs); inherited; end; 

Here we essentially check the beginning of the construction (ie, the first word), and then we look at the other tokens and their number. If something is clearly wrong with the code, then we display an error.

In the main piece of code, we simply read the code from the file in TStringList, call the TGuuOp constructors line by line and store the pointers to the instances of the classes in GuuOps: TList.

Ads:

 var LabelNames: TStringList; GuuOps, GuuVars: TList; SubMain: TGuuOp = nil; 

In conjunction with the code parsing, it would be nice to perform a couple more actions:

 procedure ParseNext(LineNum: Cardinal; Line: string); (* * Parsing code lines and define variables and labels. *) var Op: TGuuOp; GV: TGuuVar; c: cardinal; begin if Trim(Line) <> '' then begin Op := TGuuOp.Create(LineNum, Line); GuuOps.Add(Op); case Op.OpType of opSet: begin // define variable and/or optimisation var calling GV := nil; c := 0; while c < GuuVars.Count do begin if TGuuVar(GuuVars[c]).gvName = Op.OpArgs[0] then begin GV := TGuuVar(GuuVars[c]); break; end; inc(c); end; if GV = nil then begin GV := TGuuVar.Create(Op.OpArgs[0]); GuuVars.Add(GV); end; Op.OpReg := GV; end; opSub: begin // Check for label dublicade declaration if Op.OpArgs[0] = 'main' then SubMain := Op; if LabelNames.IndexOf(Op.OpArgs[0]) <> -1 then begin writeln('[Error]: Dublicate sub "', Op.OpArgs[0], '" declaration at line ', Op.OpLine, '.'); halt; end else LabelNames.Add(Op.OpArgs[0]); end; end; end; end; 

At this stage, you can check the entry points at the time of the override and recall OpReg - I used it to store a pointer to a Guu variable.

Speaking of variables, this small piece of code was taken to a separate unit:

 unit uVars; {$mode objfpc}{$H+} interface uses Classes, SysUtils; type TGuuVar = class public gvName: string; gvVal: variant; constructor Create(VarName: string); end; implementation constructor TGuuVar.Create(VarName: string); begin inherited Create; gvName := VarName; gvVal := 0; end; end. 

Now we have parsed code, which seems to be correct in syntax. It remains to analyze it and you can proceed to the implementation and the most important thing - debugging.

Next, you should implement a small semantic analysis and simultaneously prepare everything for the execution and debugging of the code:

 procedure CheckSemantic; (* * Semantic analyse and calls optimisation. *) var c, x: cardinal; op: TGuuOp; begin if GuuOps.Count > 0 then begin if TGuuOp(GuuOps[0]).OpType <> opSub then begin writeln('[Error]: Operation outside sub at line ', TGuuOp(GuuOps[0]).OpLine, '.'); halt; end; c := 0; while c < GuuOps.Count do begin case TGuuOp(GuuOps[c]).OpType of opSub:; opCall: begin TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]); x := 0; op := nil; while x < GuuOps.Count do begin if TGuuOp(GuuOps[x]).OpType = opSub then if TGuuOp(GuuOps[x]).OpArgs[0] = TGuuOp(GuuOps[c]).OpArgs[0] then begin op := TGuuOp(GuuOps[x]); break; end; inc(x); end; if op <> nil then TGuuOp(GuuOps[c]).OpReg := op else begin writeln('[Error]: Calling to not exist sub "', TGuuOp(GuuOps[c]).OpArgs[0], '" at line ', TGuuOp(GuuOps[c]).OpLine, '.'); halt; end; end; opPrint: begin TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]); x := 0; while x < GuuVars.Count do begin if TGuuVar(GuuVars[x]).gvName = TGuuOp(GuuOps[c]).OpArgs[0] then begin TGuuOp(GuuOps[c]).OpReg := TGuuVar(GuuVars[x]); break; end; inc(x); end; if TGuuOp(GuuOps[c]).OpReg = nil then begin writeln('[Error]: Variable "', TGuuOp(GuuOps[c]).OpArgs[0], '" for print doesn''t exist at line ', TGuuOp(GuuOps[c]).OpLine, '.'); end; end; else TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]); end; inc(c); end; end; end; 

In TGuuOp.NextOp of each token we write a pointer to the next token.
For the call opcode, we do everything cleverly and simply - we write a pointer to the called entry point in NextOp.

We also check the output variables using the print instruction ...

Maybe they were not announced before the withdrawal?

Now you need to implement code execution. We return to the TGuuOp class and implement the Step method:

 function TGuuOp.Step(StepInto: boolean; CallBacks: TList; Trace: TStringList): TGuuOp; (* * That method execute instruction. *) var Op: TGuuOp; CBSize: Cardinal; begin case OpType of opSub: begin Trace.Add('-> Sub "' + OpArgs[0] + '"'); Result := NextOp; end; opCall: begin if StepInto then begin if NextOp <> nil then CallBacks.Add(NextOp); Result := TGuuOp(OpReg); end else begin Op := TGuuOp(OpReg); CBSize := CallBacks.Count; while ((Op <> nil) or (CallBacks.Count > CBSize)) and (Trace.Count < STACK_SIZE) do begin if Op = nil then begin Op := TGuuOp(CallBacks[CallBacks.Count - 1]); CallBacks.Delete(CallBacks.Count - 1); Trace.Delete(Trace.Count - 1); end; Op := Op.Step(StepInto, CallBacks, Trace); end; Result := NextOp; end; end; opPrint: begin writeln(TGuuVar(OpReg).gvName, ' = ', TGuuVar(OpReg).gvVal); Result := NextOp; end; opSet: begin TGuuVar(OpReg).gvVal := OpArgs[1]; Result := NextOp; end; end; end; 

To avoid access violation in case of looping, it is better to limit the stack, which I did.
The constant STACK_SIZE = 2048, declared above, is just responsible for this.

Now it's finally time to write the main code of our debugger:

 var code: TStringList; c: Cardinal; cmd: string; CallBacks: TList; Trace: TStringList; DebugMode: boolean = true; begin if ParamCount > 0 then begin // Initialisation if not FileExists(ParamStr(1)) then begin writeln('[Error]: Can''t open file "', ParamStr(1), '".'); halt; end; if ParamCount > 1 then if LowerCase(ParamStr(2)) = '/run' then DebugMode := false; code := TStringList.Create; code.LoadFromFile(ParamStr(1)); GuuOps := TList.Create; GuuVars := TList.Create; // Parsing and preparing LabelNames := TStringList.Create; c := 0; while c < code.Count do begin ParseNext(c + 1, Trim(code[c])); inc(c); end; FreeAndNil(LabelNames); CheckSemantic; if SubMain = nil then begin writeln('[Error]: Sub "main" doesn''t exist!'); halt; end; // Start code execution CurrentOp := SubMain; CallBacks := TList.Create; Trace := TStringList.Create; if DebugMode then begin //Out code and features ClrScr; writeln('Code for debugging:'); writeln('.....'); c := 0; while c < code.Count do begin writeln(FillSpaces(IntToStr(c + 1), 4), '| ', code[c]); inc(c); end; writeln('"""""'); FreeAndNil(code); writeln(sLineBreak, 'Features:', sLineBreak, '* i - step into.', sLineBreak, '* o - step over.', sLineBreak, '* trace - print stack trace.', sLineBreak, '* var - print variables list.', sLineBreak, '* x - exit.', sLineBreak); // Execution loop while ((CurrentOp <> nil) or (CallBacks.Count > 0)) and (Trace.Count < STACK_SIZE) do begin write('Line ', CurrentOp.OpLine, ' ~> '); readln(cmd); // Execute commands if cmd = 'i' then CurrentOp := CurrentOp.Step(true, CallBacks, Trace) else if cmd = 'o' then CurrentOp := CurrentOp.Step(false, CallBacks, Trace) else if cmd = 'trace' then begin writeln('| Trace:'); c := 0; while c < Trace.Count do begin writeln('| ', Trace[c]); inc(c); end; writeln('| -> Line ', CurrentOp.OpLine, ': "', CurrentOp.OpUnChangedLine, '".') end else if cmd = 'var' then begin writeln('| Variables list:'); c := 0; while c < GuuVars.Count do begin writeln('| ', TGuuVar(GuuVars[c]).gvName, ' = ', TGuuVar(GuuVars[c]).gvVal); inc(c); end; end else if cmd = 'x' then halt; // Check for method end & make callback if (CurrentOp = nil) and (CallBacks.Count > 0) then begin CurrentOp := TGuuOp(CallBacks[CallBacks.Count - 1]); CallBacks.Delete(CallBacks.Count - 1); Trace.Delete(Trace.Count - 1); end; end; end else begin // Only run mode (/run) FreeAndNil(code); while ((CurrentOp <> nil) or (CallBacks.Count > 0)) and (Trace.Count < STACK_SIZE) do begin CurrentOp := CurrentOp.Step(false, CallBacks, Trace); if (CurrentOp = nil) and (CallBacks.Count > 0) then begin CurrentOp := TGuuOp(CallBacks[CallBacks.Count - 1]); CallBacks.Delete(CallBacks.Count - 1); Trace.Delete(Trace.Count - 1); end; end; end; if Trace.Count >= STACK_SIZE then writeln('[Runtime error]: Stack overflow!'); FreeAndNil(CallBacks); FreeAndNil(Trace); end else writeln( 'Guu debugger v1.0.', sLineBreak, 'Author: Pavel Shiryaev (@RoPi0n).', sLineBreak, 'Run: svmc guu_debugger.vmc <guu source file> [arg]', sLineBreak, 'Args:', sLineBreak, ' /run - Run Guu code.' ); end. 

By the condition of the job, the interface can be implemented as you like.

It would be possible to implement a full-fledged UI, to tie SynEdit to the project, but in my opinion, this is empty work, which will not reflect the skill, and besides, for which it will not be paid :)

So I limited myself to a small console UI.

The code above is not difficult, so you can leave it without comments. In it, we take the finished TGuuOp's and call their Step.

Screenshots of the solved problem:

image

image

Error information:

image

image

Link to the repository of my solution: Klats

Results


No special results. I will have to devote most of the summer to intense rest and searching for a university (well, just in case I get a good exam, of course), instead of two months of work and training in the JetBrains team.

Maybe next year a new post will appear on Habré, which already describes the internship process in JB or in another company that is interesting to me :)

Source: https://habr.com/ru/post/454354/


All Articles