Is there any powder in the old dog? Hackathon Radio Canada 2018 (Part Three: On your marks! Get set! Go!)
I present to you the third part of my somewhat long story.
Having received positive feedback on the first and second parts, I did not want to keep readers waiting too long, but life and reality make their own adjustments.
Tuesday, Wednesday and Thursday of the working week flew by quickly, as always. The anticipation of adventure kept growing, and by Friday it reached its peak. I warned my wife that for the weekend, starting on Friday, I would drop out of family life and disappear from the radar.
So after work on Friday, I was in a fighting mood and headed back to Radio House.
The first day. Friday evening. Time to scatter stones
Friday schedule
You can see the original here. But the question is: who needs it? I suspect very few people are interested in it in Russian. On the other hand, someone is reading and upvoting my story; what if all these little details are exactly what gives it its charm?
Friday, March 23
17:00 Arrival and registration (Dining room Radio-Canada closes at 19:00)
18:00 Greeting
18:15 Presentation of mentors, assistants, teams and tips
19:00 Prototype presentation workshop with Matthieu Dugal and Chloe Sondervorst
19:30 Break
19:50 To choose from:
1. Master class: presentation of Microsoft services, presentation of Radio-Canada services
2. Starting work in teams
22:00 End of the first day
The mention of the dining room was very appropriate. It reminded me that I would be hungry after a busy working day, and that it would not hurt to eat something before the event began. Asterisk even offered to bring me food, since she was coming to the hackathon not from college, as usual, but from home, well fed and full of energy. But as always, I outsmarted myself. I thought we would have the time, the desire and the opportunity to order a pizza. In reality, it turned out that I was the only one on our team who came on an empty stomach. So I went to the dining room and grabbed some kind of hamburger immediately after registration.
I arrived at around 5:45 pm.
Before entering the hall where the event was held, you had to register and receive a hackathon participant T-shirt and a small souvenir in the form of a water bottle. The participants were given white T-shirts, and the organizers and mentors were given black ones. My attempt to get hold of a black T-shirt did not succeed. Well, okay, white is not bad either.
When I entered the hall, Asterisk and Plato had already chosen and occupied one of the tables intended for the teams. They were busy connecting laptops to power and the Internet.
I also set up my laptop, and then used the couple of minutes left before the start of the presentation to go to the dining room and have a bite.
By the time I returned to the hall, Mercury had already joined the team. Phaethon was not there, and nobody knew when he would appear.
With a slight delay, at about 6:10 pm, we listened to a welcome speech from Maxime St-Pierre. He reiterated that Radio Canada has high hopes for this event, since last year's experience brought good practical results. He also mentioned that the topic of the hackathon, AI, is very relevant today, as confirmed, for example, by the Cambridge Analytica scandal that broke that very week.
A full recording of the evening's broadcast is available.
Maxime did not drag out his speech and handed the baton to the host of the evening, Matthieu Dugal.
Matthieu, as I understand it, is a very well-known host on Canadian television. He runs several technology programs. This played a positive role in the organization of both this evening and the entire hackathon. On the one hand, his experience with the television format helped a lot with time management. All items on the agenda, despite the delay at the beginning, ultimately fit into the previously announced schedule. On the other hand, he had no difficulty pronouncing technical terms, which I think would have been a problem for a presenter unfamiliar with technical topics.
Over the years, I have come to respect more and more the people and organizations that can stand behind their words. Despite the growing number of tools available to everyone and the universally declared "professionalism", I still often meet people and situations where the organizers cannot control their own events. In this regard, Radio Canada was at its best.
First, Matthieu once again went over all the requirements of the hackathon and the criteria for evaluating the prototypes. I will not list them again. You can re-read them in the first and second parts.
Prototype Evaluation Criteria
It looks like I did not describe the criteria for evaluating the prototypes in the first two parts. Strange, I was sure that I mentioned them.
1st place - Xbox One + $1000
2nd place - Google Home Max + $250
3rd place - $500
And two prizes (Xbox One S1), which will be raffled among the remaining participants.
(as far as we understood, EVERY member of a winning team would receive the prize)
Even though 31 teams were shown on the screen, our team and I still had illusions that half of these teams either would not participate at all, or would not cope with prototyping, or would be significantly weaker than us. At that time, I did not even count the teams. I was too busy setting up access to Azure and was watching and listening to everything with half an ear. Still, the look of the people in the hall alarmed me a little. I had assumed that most of the participants would be university students, but my eyes told me that these people with beards and glasses were hardly students. Visually, the participants were quite a mature crowd.
Adjusting the schedule a little, Matthieu announced a 20-minute break. It was very welcome: everyone needed to exchange views, walk around, stretch and freshen up.
During the break, Asterisk and Mercury asked me to explain once again the architecture of our project and the role of each team member, which I did on paper.
A little prettier and more structured than on paper
It came out something like this:
1. WEB page / Widget - HTML + JS
functionality: the visible part of the application, an HTML page with a search input field handled by JS.
performers: Asterisk - HTML, Mercury - JS
margin notes: I offered 3 options for this part. The first was a separate page (the simplest, most straightforward solution). The second was a field embedded in the Radio Canada page (I thought the kids would be able to reuse the page structure and CSS classes to draw the search results as blocks / widgets, so as not to waste time creating their own). The third was the same as the second, but with the search field made as a Chrome Extension, because I knew that the simplest browser extension is easy to make, and installing such a plugin gives you access to any page, including the news feed page we needed.
2. PHP, MySQL - Search API backend
functionality: the simplest search API. According to our plan, it should receive a query from the search field on the page, search the database and return the search results formatted as JSON.
performer: your humble servant
margin notes: since the last project I took part in at my main job was built on Laravel, and since this framework is currently a trendsetter, I decided to use Laravel. This was not supposed to bring any particular surprises, and for my part I would learn some features of Laravel by creating a new project on it, especially one that works not as a website, so to speak, but as a service API.
3. MySQL
functionality: database, storage for content and analysis results
margin notes: since there were no strict requirements on which database to use, Plato asked whether it would be better to use MongoDB. They say JSON is almost a native format for MongoDB, so it should supposedly be simpler. Honestly, even now I cannot say how to make such decisions about non-relational databases correctly. From what I have read on the topic, I conclude that if you need a fairly simple selection of fields known in advance and, as in our case, essentially from a single table, then relational databases are still the best solution. (Correct me if I'm wrong.) But my choice of MySQL was driven not by architectural preference but by pragmatic considerations: I simply know how to use it. For Plato, both databases were something new and, in any case, an experiment, so he easily agreed with me.
4. Python App
functionality: (Radio Canada -> Python App -> Azure Cognitive Services -> Python App -> MySQL) This part is essentially the most important one in terms of the hackathon requirements. It has to get the content from the Radio Canada API, possibly pre-process it, send the content for analysis to one of the AI services, get an answer and put the content together with the analysis results into the database.
performer: Plato
margin notes: thus the entire burden of responsibility fell on Plato. He was a little worried about this. But as I have already noted, he had done his homework and felt that he had all the knowledge needed for the project. Neither Plato nor the rest of the team doubted that we would cope with the task.
5. Azure Cognitive Services
functionality: at the start of the hackathon, I did not know which of the services would be used
6. Radio Canada API
functionality: at the start of the hackathon, I did not know which of the 4 services Plato would use (the 4 Radio Canada APIs are described in the second part). As with the AI services, the decision was Plato's to make, based on what was easier and more convenient for him to implement.
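The data flow of item 4 (Radio Canada -> Python App -> AI service -> Python App -> MySQL) can be sketched in a few lines. This is only an illustration of the planned architecture, not Plato's script: the function names run_pipeline, fake_fetch_news and fake_analyze are hypothetical, and the three steps are passed in as plain callables so that the sketch runs without any network or database access.

```python
def run_pipeline(fetch_news, analyze, store):
    """Fetch items, analyze each body, store the merged record; return the count."""
    stored = 0
    for item in fetch_news():
        analysis = analyze(item["body"])  # stand-in for the AI service call
        store({**item, **analysis})       # stand-in for the MySQL insert
        stored += 1
    return stored


# Hypothetical stand-ins so the sketch is runnable on its own.
def fake_fetch_news():
    return [
        {"title": "Placeholder 1", "body": "First placeholder body."},
        {"title": "Placeholder 2", "body": "Second placeholder body."},
    ]


def fake_analyze(text):
    # A real implementation would call the chosen AI service here.
    return {"analysis": {"length": len(text)}}


database = []  # a list plays the role of the database table
count = run_pipeline(fake_fetch_news, fake_analyze, database.append)
print(count, "records stored")
```

Keeping the three steps behind separate callables also reflects the idea that each part of the chain could be replaced or tested independently.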
After the break, Matthieu Dugal and Chloe Sondervorst held a master class on preparing prototype presentations. First, they drew the audience's attention to the fact that a prototype, like a person, is judged by its appearance, so success and the chances of winning depend very much on how the material is presented. The second point the presenters made concerned time constraints. On the final day, each team gets 3 minutes for the presentation itself and another 2 minutes for answering the jury's questions, if any. 3 minutes is really not much, so you need to concentrate on the most important points and set the emphasis correctly.
A simple calculation gives 31 teams x (3 min + 2 min) = 155 min = 2 h 35 min. If we add a couple of minutes for all sorts of hitches, changing teams on stage and so on, it becomes obvious that just listening to the teams will take 3 hours at best. But I repeat, at that time I did not realize that 31 full-fledged teams were taking part in the race. If you remember our original plan when we registered the team, we assumed that there would be no more than 10 teams, a couple of which might drop out by themselves.
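The same estimate as a small Python sketch. The 3 and 2 minutes come from the rules; the 1 minute of changeover per team is purely an assumption for illustration, and the function names are made up.

```python
def total_presentation_minutes(teams, pitch_min=3, questions_min=2, changeover_min=0):
    """Total stage time in minutes if every team uses its full slot."""
    return teams * (pitch_min + questions_min + changeover_min)


def as_hours_minutes(minutes):
    """Split a number of minutes into (hours, minutes)."""
    return divmod(minutes, 60)


base = total_presentation_minutes(31)
# Assumed: about 1 minute per team for changing over on stage.
with_changeover = total_presentation_minutes(31, changeover_min=1)

print(base, as_hours_minutes(base))
print(with_changeover, as_hours_minutes(with_changeover))
```

Even with this modest assumed changeover, the total passes the three-hour mark.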
Chloe sincerely tried to give a short course in public speaking in this brief time. She described the main criteria by which you can evaluate your presentation: first impression, story, believability / realism, simplicity, time (time management), emotionality, vision, passion / enthusiasm.
Concluding their part of the speech, the presenters once again drew attention to the stated criteria for evaluating the prototypes.
After the master class, participants had the opportunity to ask a few more questions. Some of them concerned how functional the prototype should be, who would check it, and whether the presentation had to include a live demonstration or whether static slides with screenshots or something similar would do.
Answering one of these questions, Maxime St-Pierre clearly stated that the prototype should definitely be functional and that it is best to include a demonstration of it in the final presentation.
I want to draw your attention to this small detail, because we based our expectations precisely on the premise that the prototype must be working and must meet the declared criteria.
After the questions were answered, we were once again wished good luck. And the work began in earnest! (approximately 20:00)
Lots of text to read
Or rather, something like work, some kind of activity. For my part, I immediately explained to all members of our team that we should develop our parts as independently as possible and try to avoid blocking each other. According to our architecture, the possible blocking points were the transitions JS <-> PHP and PHP <-> MySQL <-> Python. That is, my part could be delayed or blocked by the unavailability of, or problems with, the database. And I, in turn, could become a blocking link for our front end.
Therefore, the first step was to bring up the database and check access to it from both PHP and Python.
Plato, meanwhile, was studying both APIs, deciding which way we would go.
Asterisk and Mercury went about their business. By the way, they pleasantly surprised me by immediately setting up a shared repository on Github and working on it together from the first minutes, constantly exchanging updates. I, for my part, still have not uploaded my code anywhere, although several weeks have passed. (To be honest, I am ashamed.)
I asked Asterisk to assume that she would receive from the PHP service a JSON array of elements, each with the following minimum set of fields: video_id, title, category (some category the content belongs to: maybe sport, politics, economics; or maybe a music style or artist; or maybe just a region of Quebec), body / description (article body), image / video (URL of a video or picture). Accordingly, she had to create an HTML layout for displaying these elements. I asked her to design it for 6 elements in 2 categories.
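To make the agreed contract concrete, here is a minimal Python sketch that builds such a JSON array: 6 elements in 2 categories. All values are placeholders; the exact key names for the last two fields ("body" and "image") and the helper names make_item and make_sample_feed are assumptions, since the text only says "body / description" and "image / video".

```python
import json

# Assumed key names; the text only fixes video_id, title and category exactly.
FIELDS = ("video_id", "title", "category", "body", "image")


def make_item(video_id, title, category, body, image):
    """Build one feed element with the minimum set of fields."""
    return dict(zip(FIELDS, (video_id, title, category, body, image)))


def make_sample_feed():
    """Return 6 placeholder elements split evenly across 2 categories."""
    feed = []
    for index in range(6):
        category = "sport" if index < 3 else "politics"
        feed.append(make_item(
            video_id=index + 1,
            title="Placeholder title %d" % (index + 1),
            category=category,
            body="Placeholder article body.",
            image="images/placeholder-images.jpg",
        ))
    return feed


payload = json.dumps(make_sample_feed(), indent=2)
print(payload)
```

A static payload of this shape is enough for the front end to build and test its layout before any real data exists.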
First Alpha, appearance, front end
That is roughly what our front-end team had ready at the end of the day.
HTML generated
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"/>
  <link href="css/bootstrap.min.css" rel="stylesheet">
  <link href="css/main.css" rel="stylesheet">
</head>
<body>
  <div class="card-deck">
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
  </div>
  <div class="card-deck">
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
    <div class="card" id="card">
      <img class="card-img-top" src="images/placeholder-images.jpg" alt="Card image cap">
      <div class="card-body">
        <h5 class="card-title">Will need info from api</h5>
        <p class="card-text">Some quick example text to build on the card title and make up the bulk of the card's content.</p>
        <a href="#" class="btn btn-primary">Read more ...</a>
      </div>
    </div>
  </div>
</body>
</html>
I started several VMs, one with Debian Jessie and another with Ubuntu 16, and tried to install MySQL on them. It didn't work. Either I was in too much of a hurry or I was too tired, I don't know. Even the VMs themselves did not come up on my first attempt. Since I didn't want to disturb anyone until everything worked, I tried to run each VM in a separate security group. Azure spat out an error and said that I was not allowed to (restricted). I still do not understand why. Plato gave me a hint when I complained to him in desperation about Azure's stubbornness. After that, things went a little easier. All the machines and services I used from then on I created only in the one group and subnet initially available on our account, and of course in the same region.
On the other hand, what I liked about Azure VMs is that, unlike on AWS, I have a choice: I can access the machine over SSH either with a private key or with a traditional username and password pair. I am aware that the AWS approach is probably safer. But in some situations (such as this one), when security temporarily does not matter, the account itself is temporary, the team members lack the necessary qualifications and there is simply no time to train them, exchanging a simple username and password pair seems to me more efficient and reasonably safe.
So, I tried to bring up MySQL on two VMs and open port 3306 to the outside, to make it easier to test and connect to the server. At the same time, I brought up 2 or even 3 MySQL databases as an Azure service, hoping that the service would be easier and more convenient to use.
Meanwhile, Olivier Fortin, who as you remember is a Web Accessibility Specialist, approached Asterisk. He asked how well our HTML code met the standards. Asterisk responded eagerly to his question and showed him everything she had at that point. Naturally, the code was far from valid under WCAG 2.0. For us, this standard was something completely abstract and not a major priority. Olivier pointed out what exactly was worth paying attention to and recommended installing the utilities that people with disabilities use, in order to check how convenient our product is to use with them.
By 22:00, when we were asked to call it a day, I was in the middle of nowhere. MySQL did not respond to me anywhere or in any way. My mood was not great, and I was afraid that tomorrow I really would become a blocker for everyone. But thank God I am not a boy anymore. I calmly reasoned that Friday evening, the end of the working week, a lot of information and a not quite full stomach could all together be the cause of my failures. And following the Russian proverb that the morning is wiser than the evening, I should go home and sleep peacefully, and return tomorrow with fresh strength and a clear head.
Oh, by the way, I almost forgot: it seems Phaethon appeared at 8:30 pm or 9:00 pm. I don't remember exactly, because by that time I was completely immersed in my trial-and-error experiments with Azure.
Second day. Saturday. The throes of creativity
Saturday schedule
Saturday March 24th
9:00 Arrival, breakfast, continued work in teams
12:00 Lunch, continued work in teams
18:00 Dinner, continued work in teams
22:00 End of the second day of competition
As you can see, in terms of the schedule the second day was the most concise.
Work, work and work again.
Despite all my impatience, on Saturday morning I arrived after everyone else. Well, not after everyone: I was the fourth member of our team to arrive, at about 9:30. I had to take my younger son to Russian school.
Why do we take children to a Russian school?
Since we want the child not to completely lose cultural ties with us and with his grandparents, aunts and uncles, sisters and brothers living far from us geographically, but always close to us in heart, we take the child every Saturday to the Russian school. By the way, it is the largest of Russian schools located outside Russia.
We know many examples of children who arrived in Canada at a rather young age and / or were born here and who, after a few years, forgot their native language. The fact is that many parents who come to Canada focus too much, in my opinion, on integration. I can understand them. Indeed, why move if not to integrate into the local community? And trying to do their best for their children, they start speaking English or French at home. And children are built so that they learn a new language very quickly. The younger the child, the faster the switch. And then kindergarten and school do their job.
In addition, extra classes on Saturdays or Sundays require additional effort, money, time and attention. Some do not have enough money, some do not have enough time, and some simply get tired of explaining and proving to the child why he needs it. And the children here grow up, let's call it, "freedom loving". And believe me, driving a child 20 km there every Saturday morning and picking him up in the evening is an additional burden that strains and ruins many plans for the rare weekend days.
In addition to Russian language, literature and geography, this school (like several other Russian schools in the region) is believed to provide a good additional grounding in mathematics and physics, which helps students cope better with the regular Canadian curriculum at their main public schools.
Asterisk, by the way, also attended this school until she entered college. She has kept very friendly relations with many of her classmates from there. They still meet and spend time together regularly, even if not very often because of how busy they are.
I greeted my colleagues and set up my workplace. Asterisk then escorted me to an improvised buffet where you could get coffee, juice and a cupcake for breakfast.
On the way, she shared her plans for the day. She said that she and Mercury were in full swing and that I need not worry: if they had any questions or problems, they would let me know. We returned to our table.
Plato informed me that he had decided on a direction and was completely immersed in work. He also said in passing that MySQL was working. I was surprised, but did not start finding out what exactly was working and how. It was a little embarrassing that Plato had got working what I was supposed to prepare.
In the meantime, I asked Asterisk and Mercury to show me where they were heading, in order to understand how far they were from a possible blocking point, when we would need to connect all our parts together. Asterisk said that she was still working on the Chrome Extension, which they had easily managed to launch in its simplest form by following a tutorial, and that she now understood how this browser extension could interact with the Radio Canada page.
Mercury said that he was working on a dynamic input field that would send a request to the backend on each key press. So it turned out that Mercury already needed my backend service in order not to move forward blindly.
Leaving them to continue working in the right and now somewhat more definite direction, I went over to chat with Plato.
Plato gave me a rather large and important piece of information.
Where are we going with Piglet?
From Radio Canada, we will use the Lineup public news API.
The Text Analytics service will be used as the AI. Accordingly, after the analysis we will have, in addition to the Radio Canada content, two additional parameters / database fields: SENTIMENT (emotional evaluation of the text: from 0 to 1, where 0 is a negative text and 1 is a positive one) and KEY PHRASES (keywords and phrases). I simply took note of this information and tried to work out what it changed in our project and how these additional fields could be used.
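One possible way of using the two extra fields in a search, sketched in Python. This is an assumption about how they could be used, not what the team implemented: the function name search_by_key_phrase, the record layout and the sample records are all made up for illustration. The sketch matches the query against the key phrases and puts the most positive texts first.

```python
def search_by_key_phrase(records, query):
    """Return records whose key phrases mention the query, most positive first."""
    needle = query.strip().lower()
    if not needle:
        return []
    matches = [
        record for record in records
        if any(needle in phrase.lower() for phrase in record["key_phrases"])
    ]
    # sentiment runs from 0 (negative) to 1 (positive)
    return sorted(matches, key=lambda record: record["sentiment"], reverse=True)


# Placeholder records, invented for the example.
records = [
    {"title": "A", "sentiment": 0.2, "key_phrases": ["hockey game", "Montreal"]},
    {"title": "B", "sentiment": 0.9, "key_phrases": ["hockey season"]},
    {"title": "C", "sentiment": 0.5, "key_phrases": ["budget"]},
]

for record in search_by_key_phrase(records, "hockey"):
    print(record["title"], record["sentiment"])
```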
When I asked Plato how he had managed to bring up the database, he said: "You brought it up yesterday yourself." It turned out that yesterday, during my experiments, I had regularly sent Plato on Skype the names of the VMs and services I was trying to launch, along with the usernames and passwords. I did it "in reserve", so to speak, so that I would not have to search for them later once I decided which service we would use. For his part, Plato tested one of them and the service answered him just fine, i.e. the DB responded as alive. I asked Plato which service or VM he meant and which utility he had used. It turned out that since the database world was almost new to Plato, he had simply connected to the specified service from Python, following some tutorial found on the Internet.
On the one hand, this pleased me: it turned out that I had brought the service up after all. On the other hand, it was not clear why I could not connect to it yesterday from my favorite PHPStorm. The explanation was simple: Plato connected to the service from the inside, so to speak, within the Azure network and in the same security group, from an already running VM with his Python. And last night I had been trying to set up and test public access for everyone. That is, I never reached the goal of public access, but the database was available for work, and that was the main thing. So the option that worked for us was MySQL as a service from Azure. The rest of the VMs brought up the day before I simply did not use and did not shut down until the end of the hackathon, because there was no time.
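The difference between "reachable from inside the network" and "reachable from outside" can be checked without any database client, simply by trying to open a TCP connection to the port. A minimal Python sketch using only the standard library; the function name can_connect is made up, and the host in the usage comment is a placeholder, not a real server.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered by a firewall, timed out or unresolvable host.
        return False


# Usage (placeholder host name, 3306 is the default MySQL port):
# can_connect("db.example.invalid", 3306)
```

Running the same check once from a laptop and once from a VM inside the cloud network would show whether the problem is the database itself or only the public access rules.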
So, my partners are all working, it is already past 10, and I don't even have a dev server with PHP.
How to get a server with PHP
This time, for some reason, everything worked the first time.
I bring up a new Debian Jessie VM. It takes a couple of minutes (about the same as it takes on AWS). I successfully connect over SSH with my old friend PuTTY, which is pleasing. I install PHP 7.2, and it seems to be up. I install Composer, all systems normal. I run composer global require "laravel/installer", apparently without errors. Following the official installation guide further, I create a Laravel project with laravel new api. Everything goes smoothly. Just a couple more steps: I install Nginx and change the default config to point to the /public folder of my Laravel project and to call php-fpm for php files. In Azure, I open access to port 8080 for my VM with PHP.
To be safe, I edited the default welcome.blade.php to be sure that I was seeing exactly my page and my project.
<div class="title mb-md"> Hackathon 2018 Radio Canada </div>
I check in the browser at the public IP of my VM and voila!
The first goal: do not block the guys working on the front end. I need to create a simple controller and give them a mock response, even a static one. At least they will be able to start working out the interaction.
PHP first iteration
Create a new route in routes/web.php
Route::get('/search', 'SearchController@index');
and a new app/Http/Controllers/SearchController.php, so far with a single method / action, index
This was already some kind of progress, and I felt that, like the other members of the team, I had joined the work. The anxiety that I would turn out to be the weak link receded.
By this time, Asterisk had begun adapting the code for Web Accessibility. For a start, she had installed NVDA the night before to be able to test her changes. Over the course of the day she really became, if not an expert, then at least well versed in Web Accessibility, and I think that she, like me, came to understand what the term "site semantics" means: semantics stops being something theoretical and acquires very important practical value.
For comparison, here is the final HTML that was presented in the final version:
If you compare it with the original version, you will see that some <div> tags were replaced by the more semantically meaningful <section> and <article>, and that the <img> and <a> tags gained alt and title attributes respectively. Before this hackathon, I attributed such modifications mostly to SEO and considered them excessive, with a somewhat unclear practical purpose. Now I think the practical purpose of such practices will long remain part of how I see HTML code: these changes really help programs like NVDA to "read" the screen or web page in a more understandable, human language.
The next step in my plan was to create a service for Asterisk that would replace her static data with my static data, which I would later replace with dynamic data from the database. I added one more route to the Laravel routes
Route::get('/feed', 'SearchController@feed');
And accordingly another method in my SearchController
It was already past 11. I asked Asterisk to try replacing her static data with a call to my service. At that time she was mainly busy modifying the HTML and CSS. I should note that after talking with Olivier, she had stopped using bootstrap.css. I somewhat disagreed with this decision, but the lack of time did not allow for discussion. I decided that everyone does their piece the way they understand it. And then Asterisk tells me that the call to my service is not working! I rechecked several times in my browser: my service returned JSON perfectly.
First I dug into the JS code, not really trusting Asterisk, since she did not claim to know JS 100%.
I was a little discouraged to find some code like this:
function parseJson() {
    xml = new XMLHttpRequest();
    xml.onreadystatechange = getData;
    xml.open("GET", "http://.../feed/");
    xml.send();
}

function getData() {
    if (xml.readyState == XMLHttpRequest.DONE && xml.status == 200) {
        console.log(xml.responseText);
        jsonArray = JSON.parse(xml.responseText);
        fillInCards();
    } else {
        console.log("There was a problem with connection");
    }
}
I am not a JS specialist. But in what I am used to working with, AJAX necessarily means jQuery or its analogs. I do not remember where this opinion came from, but somehow I am internally opposed to using raw JS for AJAX calls. Probably, due to my limited knowledge, I rely on libraries that by default will perform for me some actions that I might forget or not know about: opening / closing a connection, handling a non-standard event, calling some handler, checking / validating data before sending and / or upon receipt.
For me, the code of the form would be more familiar:
But what can you do: Asterisk insisted that her code had been tested in many labs and projects at college and should work. I had to believe her. Moreover, after a little investigation, I found:
which seemed not quite logical to me. That is, the AJAX request received a response from my service with status 200, but did not receive the response body and wrote an error to the JS console: No 'Access-Control-Allow-Origin' header is... For about 5 minutes I sat there dumbly, trying to realize what was happening. Then it dawned on me that this was a matter of default security settings, not so much of Laravel as of Chrome itself, which by default allows AJAX requests only to the page's own domain / host.
After a little googling, I decided that I could simply return the 'Access-Control-Allow-Origin' header along with the data, which should solve the problem. After a little experimenting, my method took the form:
After that, JS seemed to stop complaining. It looked like the link between the front end and the back end was taking shape.
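The idea of the fix can be shown independently of Laravel. The following is a minimal stand-alone Python sketch using only the standard library; it is not the code from the hackathon. The handler name FeedHandler, the helper fetch_cors_header, the placeholder payload and the wildcard value "*" are all assumptions for illustration. The server returns JSON together with the 'Access-Control-Allow-Origin' header, and a small client checks that the header actually arrives.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FeedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps([{"title": "placeholder"}]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # Assumed value: "*" lets pages from any origin read the response.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def fetch_cors_header():
    """Start the server on a free local port, request /feed, return (header, data)."""
    server = HTTPServer(("127.0.0.1", 0), FeedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/feed" % server.server_address[1]
        # Bypass any system proxy settings for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            header = response.headers.get("Access-Control-Allow-Origin")
            return header, json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


header, data = fetch_cors_header()
print(header, data)
```

A wildcard origin is acceptable for a throwaway prototype; a real service would normally list only the origins it trusts.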
It was time for a second breakfast or a first lunch, as you please; the locals simply call it lunch. Asterisk and Mercury went to eat first. Plato and I, looking up briefly from the code, discussed the current progress and the structure of the database table. Plato once again confirmed that the code for receiving news and sending it to the AI for analysis was almost 90% ready.
Unfortunately, Plato created the repository only after the hackathon, so it is impossible to trace the development of his script over time. I will insert the entire script near the end of this part. For now, I will just say that before lunch Plato told me I could expect about 100-120 records to appear in the database shortly after lunch. This suited me fine for the moment; it felt like the development was moving on all fronts exactly as planned. And with peace of mind, we also went to eat.
What knowledge workers are fed
The food was satisfying, but weird. There was some kind of quinoa porridge.
After lunch, while waiting for real data in the database, I focused on interaction with the front end. I asked Asterisk to show me how their integration with the Radio Canada page was going. She showed that all they had managed to achieve so far was to wipe out all the original HTML, or rather the main content block.
And my dreams and fantasies about reusing the original CSS were very unlikely to come true. With this quality of design, I saw no point in trying to integrate into the original Radio Canada page. It turned out that with such a clumsy integration the page did not become prettier, as intended, but rather the opposite: we just disfigured the original design with our inserts. On top of that, it was still unknown how the original JS and CSS would react.
After a brief discussion, everyone agreed to slightly change the concept of our prototype. I suggested dropping the idea of integration: since we had a browser plugin anyway, we could display the results right in it. That is, directly below the single-line search field, show a panel on the right side of the browser window.
Time went on. There was a hitch when Asterisk had to rewrite her loops to make the number of categories equal to 2, to dynamically create a section when the next item brought a new category, and to add content to the existing section when the category had already been received. Plus, the design for the right panel meant the output was no longer in 3 columns but in a single one, like a feed.
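The "create a section for a new category, append to an existing one" logic can be sketched roughly as follows. This is only an illustration, not Asterisk's actual code: the `category` and `title` field names and both function names are assumptions, since the real feed format is not shown here.

```javascript
// Group a flat list of records into sections keyed by category.
// A Map keeps the categories in the order in which they first appear.
function groupByCategory(articles) {
  const sections = new Map();
  for (const article of articles) {
    if (!sections.has(article.category)) {
      // First time we see this category: open a new section.
      sections.set(article.category, []);
    }
    // The category already has a section: just append to it.
    sections.get(article.category).push(article);
  }
  return sections;
}

// Flatten the sections into a single-column feed:
// a heading line for each category, followed by its titles.
function renderFeed(sections) {
  const lines = [];
  for (const [category, items] of sections) {
    lines.push("== " + category + " ==");
    for (const item of items) {
      lines.push(item.title);
    }
  }
  return lines;
}
```

With this shape the number of categories is never hard-coded; it is simply whatever the data brings.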
Plato got briefly stuck on writing data to the database. As I noted earlier, this was his first real use of a database from Python. The hitch was with the prepared statement. When Plato got tired of fighting Python messages that were incomprehensible to him, he asked me to take a fresh look at what the problem could be. Here I was able to help, since databases are my subject, so to speak. Plato had tried to adapt a prepared-statement example from the Internet but used direct concatenation of values, and naturally Python swore. As soon as we separated the query from the data, everything worked.
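The difference between concatenating values and separating the query from the data can be shown without any database at all. Plato's script was in Python and is not reproduced here, so the sketch below is in JS and only builds the statement; the sample row, both function names, and the `?` placeholder style are assumptions (placeholder syntax depends on the driver), while the table and column names follow the PHP query shown later in this part.

```javascript
// A hypothetical record; note the apostrophe in the title.
const sampleRow = { title: "L'hiver arrive", summary: "Un texte", sentiment: 0.8 };

// Fragile: the values are pasted into the SQL text itself,
// so a quote inside the data changes the statement.
function buildConcatenated(row) {
  return "INSERT INTO articles (title, summary, sentiment) VALUES ('" +
    row.title + "', '" + row.summary + "', " + row.sentiment + ")";
}

// Safer: the SQL text contains only placeholders; the values travel
// separately and the database driver binds them when executing.
function buildParameterized(row) {
  return {
    sql: "INSERT INTO articles (title, summary, sentiment) VALUES (?, ?, ?)",
    params: [row.title, row.summary, row.sentiment],
  };
}
```

In the concatenated version the apostrophe in the title closes the string literal early and the statement no longer parses. In the parameterized version the SQL text is the same for every row, and the driver receives the values as data.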
As soon as the first data appeared in the database, I replaced my stubs with real data. As a result, the styles and elements on the front end immediately drifted. The category headings, first, turned out to be too large, and second, it turned out that the data, for some reason, randomly included HTML markup.
Having discovered this, Asterisk began to redo the content output in JS, and I naturally began to change it on my side. I figured that since I was the content supplier for the front end, even if I got dirty text from the database, I had to clean it up and return text already cleared of HTML garbage.
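A cleanup of this kind can be sketched in a few lines of plain JS. This is an illustration under assumptions, not the function we actually wrote: the name `stripHtml` is made up, only a handful of common entities are decoded, and regular expressions are good enough for tidying display text but are not a security sanitizer.

```javascript
// Remove HTML garbage from a text field and return plain text.
function stripHtml(dirty) {
  return String(dirty)
    // Drop script and style blocks together with their content.
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<style[\s\S]*?<\/style>/gi, " ")
    // Drop all remaining tags, including inline style attributes.
    .replace(/<[^>]+>/g, " ")
    // Decode a few common entities; &amp; goes last to avoid double decoding.
    .replace(/&nbsp;/gi, " ")
    .replace(/&lt;/gi, "<")
    .replace(/&gt;/gi, ">")
    .replace(/&quot;/gi, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/gi, "&")
    // Collapse the whitespace left behind by the removed markup.
    .replace(/\s+/g, " ")
    .trim();
}
```

Decoding `&nbsp;` before the text goes anywhere else also keeps stray entity names out of whatever processes the text next.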
It's good that in this case we didn't need the article bodies: as you can see, the storage format quite unexpectedly includes not only HTML markup and inline CSS styles but also calls to JS functions. I would not dare call this an API, let alone a public API. But we have what we have, and for our prototype the title and summary are enough; for them we quickly wrote a function that cleans out the HTML garbage and then applies the necessary formatting to the other data.
Formatting data from the database before sending it to the frontend
Time is ticking. Everyone is working and debugging. All parts of the prototype look functional. The team feels an emotional lift.
But, as we had known from the very beginning, we had to somehow tie the AI to our search box!
So, we have the "emotional evaluation" and "key words" that the almighty AI returned to us. But how can we show the value of these parameters in the search? Should we sort the search results by positivity rating? What do we do with the keywords? All the more so because, thanks to the amazingly "dirty" content, the keywords look like this:
Note the keyword "nbsp" ;-) Although, to give the AI its due, it did not seem to return HTML tags as keywords.
On the fly, we decided that, however we ended up using this additional analysis data, I definitely needed to return it along with each record, as you have already seen in the function just above.
For visualization, I suggested a pseudographic progress bar made of plain characters. I simply didn't want the guys to spend time searching for plug-ins, especially since, as I said, they worked with bare JS, without even jQuery, so I doubted they could quickly find and wire something in.
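Such a character-based bar takes only a few lines of bare JS. The sketch below is an illustration, not the code from our plugin: the function name, the choice of `#` and `-` as characters, and the assumption that the sentiment score lies between 0 and 1 are all mine.

```javascript
// Draw a positivity score as a text progress bar.
// score is assumed to be a number between 0 and 1.
function sentimentBar(score, width = 10) {
  // Treat missing or non-numeric scores as 0 and clamp to the 0..1 range.
  const clamped = Math.min(1, Math.max(0, Number(score) || 0));
  const filled = Math.round(clamped * width);
  return "[" + "#".repeat(filled) + "-".repeat(width - filled) + "] " +
    Math.round(clamped * 100) + "%";
}
```

For example, `sentimentBar(0.8)` gives `[########--] 80%`, which can be dropped into the page as ordinary text without any plug-in.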
As for the keywords, we came up with turning them into hashtag links to Twitter. At first it seemed that Twitter would not have content for an arbitrary word or phrase, but after several tests the idea turned out to be quite workable.
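Turning a keyword into a hashtag link might look roughly like this. It is a sketch under assumptions: the function name is made up, and gluing multi-word key phrases into one tag by dropping spaces and punctuation is my guess at a reasonable rule, not necessarily what our plugin does.

```javascript
// Build an HTML link to the Twitter hashtag page for a keyword.
function hashtagLink(keyword) {
  // Hashtags cannot contain spaces or punctuation:
  // keep only letters, digits and underscores (in any language).
  const tag = String(keyword).replace(/[^\p{L}\p{N}_]+/gu, "");
  if (!tag) {
    return ""; // nothing usable left, e.g. the keyword was only punctuation
  }
  const url = "https://twitter.com/hashtag/" + encodeURIComponent(tag);
  return '<a href="' + url + '" target="_blank">#' + tag + "</a>";
}
```

Because the tag contains only letters, digits and underscores after the cleanup, it is safe to insert into the markup as is, and accented French words survive in the visible text while being percent-encoded in the URL.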
Thus, at this stage, we decided that we would simply expand and enrich the search results with an assessment of positivity and hashtag links.
Somehow dinner time came quickly and imperceptibly. Although everyone was deeply immersed in work, we all gladly broke away from the code and went to the room where rations were brought to the participants :-) I don't remember the exact menu, but dinner was cold again, which disappointed me a little. And, as I said, coffee was in short supply too.
During dinner we discussed plans for the rest of the day; we still had a lot to debug. I asked whether we might be able to add filtering by level of positivity to our plugin. Say, if the user doesn't want to read "bad" news today, he can raise the minimum level of positivity in the search and we will return filtered results. The guys agreed to try, since their duties were divided: Asterisk was mainly busy with design and layout, while Mercury worked with my PHP backend. That part was more or less finished and working, so we could well manage to add one or two new requests.
With Plato we discussed the need to process more data. He explained that the 100-120 records we had already received were just one category, and that he could pull several more categories, run everything through the AI, and load it into the database. So that is what we decided.
So after dinner, we all went back to work. But I asked Asterisk to go and buy coffee.
Don't take this as advertising; it is more of a national flavor
And when Asterisk returned with the magical hot drink, we got a second wind.
Over the next 3-4 hours we worked intensively, trying to create something nice. For dynamic search, I wrote two different queries, chosen by the number of characters in the search term arriving at my backend. Since I knew that full-text search does not work well with short words and simply ignores many letter combinations, I decided to use an ordinary LIKE search for terms of up to 3 characters and, once the search string got longer, to run a different query with MATCH.
This lets us avoid empty responses from the backend, for a nicer presentation.
Don't judge too harshly
public function searchRange($min, $max, $term)
{
    $pdo = DB::connection()->getPdo();
    if (strlen($term) > 3) {
        // Longer terms: full-text search with MATCH.
        $query = "SELECT * from articles
            WHERE ROUND(sentiment,1) >= :min AND ROUND(sentiment,1) <= :max
            AND images != '0'
            AND MATCH (title, summary, body, keyphrases) AGAINST (:term IN NATURAL LANGUAGE MODE)
            LIMIT 50;";
        $stmt = $pdo->prepare($query);
        // $min and $max arrive as whole numbers; dividing by 10 maps them onto the sentiment scale.
        $stmt->bindValue('min', round(intval($min) / 10, 2));
        $stmt->bindValue('max', round(intval($max) / 10, 2));
        $stmt->bindParam('term', $term);
    } else {
        // Short terms (up to 3 characters): plain LIKE, since full-text search ignores them.
        $query = "SELECT * from articles
            WHERE ROUND(sentiment,1) >= :min AND ROUND(sentiment,1) <= :max
            AND images != '0'
            AND (title LIKE :term OR summary LIKE :term1)
            LIMIT 50;";
        $stmt = $pdo->prepare($query);
        $stmt->bindValue('min', round(intval($min) / 10, 2));
        $stmt->bindValue('max', round(intval($max) / 10, 2));
        $stmt->bindValue('term', "%$term%");
        $stmt->bindValue('term1', "%$term%");
    }
    $stmt->execute();
    $res = $stmt->fetchAll();
    $ret = [];
    // Clean up and format every record before returning it to the front end.
    foreach ($res as $article) {
        $ret[] = $this->decorate($article);
    }
    return response()->json($ret, 200)->header('Access-Control-Allow-Origin', "*");
}
But we were all already feeling joy and probably even pride that the solution worked at all. We had a full-fledged prototype that worked according to our idea. It met all the criteria announced by the organizers. That was more than enough for all of us.
At the very end of the day, Plato delivered approximately 1,200 records in the table across 80 categories / regions. For quick testing and a demo, the result looked at least workable.
Wrapping up this part of my story, I want to apologize, first, for the delay (almost a month has passed since the second part was published); believe me, I sincerely tried to write faster. During this time I managed to lose my job, find a new one, and study, albeit superficially, the ELK stack (Elasticsearch, Logstash, Kibana, Filebeat). I think the next article will be about Elastic.
When I started writing this third part, I was sure it would be the final one. I overestimated my strength and underestimated the amount of material. Probably, with experience, I will learn to express things more briefly. Please bear in mind that in this case it is hard to tell what exactly will interest the reader. This is still not quite a technical article but one about everything at once. Yet at the same time I want to show the technical side of what was happening.
Oh, I almost forgot: you can download and install our Chrome plugin, though only in developer mode. You can look through the code before installing; thank God, there is practically nothing to read.
I will try very hard not to drag out the final part, which will describe the last day, the finalists' projects, and the general conclusions I drew for myself.