
Ratings Service / Online service + REST API for searching movie ratings

Ratings Service is an online service that lets you look up a movie's rating by its title.

image

Distinctive features:
  1. searches several online databases at once (currently KinoPoisk.ru and KinoKopilka.ru)
  2. simple interface; can be used from a mobile phone, for example through Opera Mini
  3. search results can be retrieved as XML in REST style
  4. hosted on Google App Engine / Java
  5. open source

I would like to talk about some features of the implementation and share my impressions of working with Google App Engine / Java. Below you will also find the address of the project on Google Code, where the source code is published.
It took me three days to write the service, and it was really fun. Fun mainly because a lot of things do not work in GAE / Java, so some had to be implemented from scratch, for example a simple engine for XPath queries :)

But first things first.

Development tools


According to the Getting Started Guide, Eclipse is currently the only officially supported IDE, by means of the Google Eclipse Plugin. Three days ago only the version for Ganymede / 3.4 (which I used) was available; today a version for Galileo / 3.5 is available as well.

Creating a GAE Web Application project is as simple as creating a regular Java project in Eclipse. Running and debugging the application on the local machine works fine. On top of that, one-button deployment makes development a real pleasure.

I also liked a nice little detail: the log4j.properties file was already in the project and configured properly :) All I had to do was add the log4j jar to the classpath and enable debug logging for my code:

 log4j.category.dmitrygusev.ratings = DEBUG, A1


On the downside (I am not sure whether to blame GAE or Eclipse, though more likely Eclipse), deployment stopped working on the third day ( GAE Issue # 1235 ), but this was quickly fixed.

Another downside is that the project directory structure of a GAE Web Application differs from the Maven2 directory structure. Of course, a Maven2 project can be configured for Google App Engine, but I would like to see Maven2 support right in the Google Eclipse Plugin.

Frameworks & Libraries


As for frameworks and third-party libraries, most of those that would really be useful do not work in GAE / Java, mainly because Google has partially restricted access to the Reflection API, as well as to other APIs that could pose security risks.

What this project lacked most was support for XML frameworks (at least JAXB) and XPath. The standard Java XML APIs (DOM, SAX, StAX?) are supported, and I used to work with them comfortably, but now they do not seem very user-friendly.

In the end I settled on the NanoXML Lite library, which had proven itself in the mobile client project for the online service www.4konverta.com .

The entire NanoXML Lite API consists of a single class, XMLElement. Working with it is very simple, for example:

 XMLElement xml = new XMLElement();
 xml.parseString("<xml attr='value'>hello world</xml>");

 xml.getChildren();  // Returns the child elements
 xml.getAttribute("attr");  // Returns the attribute value, etc.


But there are two drawbacks that I had to fix:
  1. out of the box, NanoXML Lite does not support mixed content, which is common in XHTML;
  2. NanoXML Lite, like any other XML parser, does not support the &nbsp; entity, which made it difficult to build a system of XHTML templates (see below).
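
The second drawback follows from XML itself: only five entities (&lt;, &gt;, &amp;, &apos;, &quot;) are predefined, so &nbsp; is undefined unless a DTD declares it. The article's fix lives inside the modified NanoXML Lite. As a rough illustration of a different workaround, the sketch below rewrites &nbsp; to the numeric reference &#160; before parsing, using the JDK's DOM parser instead of NanoXML. The class and method names are hypothetical and are not taken from the project.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the Ratings Service sources.
class NbspWorkaround {

    // Replaces the XHTML-only entity with the equivalent numeric character reference.
    static String replaceNbsp(String xhtml) {
        return xhtml.replace("&nbsp;", "&#160;");
    }

    // Parses a template fragment with the JDK DOM parser after the replacement.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(replaceNbsp(xhtml))));
        } catch (Exception e) {
            throw new IllegalArgumentException("Template is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>7.5&nbsp;/&nbsp;10</p>");
        // The text content now contains U+00A0 (no-break space) characters.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

A plain string replacement like this would also touch &nbsp; inside CDATA sections, which is acceptable for simple page templates but worth keeping in mind.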


I had to write XPath support myself (112 lines of formatted code). In my opinion it turned out quite well:

 @Test
 public void testXPathSlashInAttributeValue2() {
	 XMLElement xml = XMLUtils.createElementFromString("<xml a='../../../file.txt'><value b='0'/></xml>");
	
	 List<XMLElement> results = xml.find("//[@a='../../../file.txt']/value");
	
	 assertEquals(1, results.size());
	 assertEquals("0", results.get(0).getAttribute("b"));
 }
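
To give an idea of what such a mini engine involves, here is a minimal sketch that evaluates the same kind of query. It is not the author's 112-line implementation: it works on the JDK's DOM classes rather than on XMLElement, and it assumes a tiny grammar in which a path starts with // and each step is an optional element name followed by an optional [@attr='value'] predicate. The class name MiniPath and all of its methods are hypothetical. The point the test above is named after is visible in splitSteps: slashes inside quoted attribute values must not be treated as step separators.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical mini path finder: supports only //step/step with optional [@attr='value'].
class MiniPath {

    // Splits on '/' but ignores slashes that appear inside single quotes.
    static List<String> splitSteps(String path) {
        List<String> steps = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : path.toCharArray()) {
            if (c == '\'') quoted = !quoted;
            if (c == '/' && !quoted) {
                steps.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        steps.add(current.toString());
        return steps;
    }

    // Checks the optional element name and the optional [@attr='value'] predicate.
    static boolean matches(Element e, String step) {
        int bracket = step.indexOf('[');
        String name = bracket < 0 ? step : step.substring(0, bracket);
        if (!name.isEmpty() && !name.equals(e.getTagName())) return false;
        if (bracket < 0) return true;
        String predicate = step.substring(bracket + 2, step.length() - 1); // attr='value'
        int eq = predicate.indexOf('=');
        String attr = predicate.substring(0, eq);
        String value = predicate.substring(eq + 2, predicate.length() - 1);
        return e.hasAttribute(attr) && e.getAttribute(attr).equals(value);
    }

    // Adds the element itself and all of its descendants that match the step.
    static void collect(Element e, String step, List<Element> out) {
        if (matches(e, step)) out.add(e);
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) collect((Element) n, step, out);
        }
    }

    static List<Element> find(Element root, String path) {
        if (!path.startsWith("//")) throw new IllegalArgumentException("Only //-paths are supported");
        List<String> steps = splitSteps(path.substring(2));
        List<Element> current = new ArrayList<>();
        collect(root, steps.get(0), current); // first step: anywhere in the tree
        for (String step : steps.subList(1, steps.size())) {
            List<Element> next = new ArrayList<>(); // later steps: direct children only
            for (Element e : current) {
                for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n instanceof Element && matches((Element) n, step)) next.add((Element) n);
                }
            }
            current = next;
        }
        return current;
    }

    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Element xml = parse("<xml a='../../../file.txt'><value b='0'/></xml>");
        List<Element> results = find(xml, "//[@a='../../../file.txt']/value");
        System.out.println(results.size() + " " + results.get(0).getAttribute("b")); // prints "1 0"
    }
}
```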

As for web frameworks, at first I wanted to use Tapestry 5 because I know it well, but the latest version (5.1) is not supported on GAE, precisely because of the XML parser limitations, so I gave up on it. For those interested: the previous version, 5.0.18, reportedly works.

In the end I decided to drop the web framework altogether and do everything on the Servlet API. For this I had to implement a couple of helper classes for working with XML page templates, based on XHTML and the modified NanoXML Lite.

So, for example, the code for processing an index page (see figure) looks like this:

 private void responseWithQueryForm(HttpServletResponse resp) throws Error,
		 IOException {
	 XMLElement indexXml = TemplateLoader.loadTemplate("index-template.htm", "UTF-8");
	 XMLElement queryFormXml = TemplateLoader.loadTemplate("query-form-template.htm", "UTF-8");

	 ResultsXhtmlRender.fixCssLinks(indexXml);
		
	 XMLElement queryDiv = indexXml.findById("query-form");
	 queryDiv.setContent(null);
	 queryDiv.addChild(queryFormXml);

	 resp.setContentType("text/html");
	 resp.setCharacterEncoding("UTF-8");
	
	 resp.getWriter().println("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">");
	 resp.getWriter().println(indexXml.toString());
 }
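
The template step in the method above (find a placeholder by id, clear it, attach a parsed fragment) can be sketched with the JDK's DOM classes alone. This is only an illustration of the idea under that assumption: the project itself uses the modified NanoXML Lite, and TemplateSketch with its methods is a hypothetical name, not part of the sources. The id lookup is done by hand because DOM's getElementById only recognizes attributes that a DTD declares as IDs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration of the "fill a placeholder in a page template" step.
class TemplateSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Template is not well-formed XML", e);
        }
    }

    // Depth-first search for the element whose id attribute equals the given value.
    static Element findById(Element e, String id) {
        if (id.equals(e.getAttribute("id"))) return e;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element found = findById((Element) n, id);
                if (found != null) return found;
            }
        }
        return null;
    }

    // Replaces the content of the placeholder element with the root of the fragment.
    static void fill(Document page, String placeholderId, Document fragment) {
        Element target = findById(page.getDocumentElement(), placeholderId);
        if (target == null) {
            throw new IllegalArgumentException("No element with id " + placeholderId);
        }
        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        // Nodes belong to their document, so the fragment is imported before it is attached.
        target.appendChild(page.importNode(fragment.getDocumentElement(), true));
    }

    public static void main(String[] args) {
        Document page = parse("<html><body><div id='query-form'>placeholder</div></body></html>");
        Document form = parse("<form><input name='q'/></form>");
        fill(page, "query-form", form);
        Element div = findById(page.getDocumentElement(), "query-form");
        System.out.println(div.getFirstChild().getNodeName()); // prints "form"
    }
}
```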

How things work


The scheme of the service is very simple.

The service sends the query string entered by the user to the KinoKopilka and KinoPoisk search engines one after the other (this is another GAE / Java restriction: multithreading cannot be used). This is why a request takes 5-10 seconds to execute. In addition, the ratings and some other details shown in the search results are located on other pages rather than on the search pages of the movie sites, which slows things down further.

For fetching HTML pages, GAE / Java provides URLFetchService, a fairly convenient API (I advise you to look at the Low-Level API section) built on top of the Apache HTTP Client . Apache HTTP Client itself refused to work on GAE / Java.

Initially I wanted to parse the results with my XML parser, but because KinoKopilka and especially KinoPoisk produce XML that is not well-formed, most of the parsing is done with regular expressions. I say "most" because KinoKopilka's XML is almost correct: there is just one extra closing tag (I hope they will fix it :), so for now I insert the missing opening tag by hand and use NanoXML Lite + SimpleXPathService. (It is no accident that I list the XHTML / CSS validators at the end; I hope popular sites will start using them sooner or later, and new ones should think about it right away.)
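
As an illustration of why regular expressions cope where an XML parser gives up, here is a minimal sketch. The markup in it is invented for the example and is not the real KinoPoisk or KinoKopilka page structure; the class name and the pattern are hypothetical too. Note the unclosed br tag in the sample: an XML parser would reject it, while the pattern simply skips over it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scraper sketch; the HTML structure below is invented for illustration.
class RegexScrapeSketch {

    // Captures a film title and the rating that follows it.
    private static final Pattern FILM = Pattern.compile(
            "<a class=\"film\"[^>]*>([^<]+)</a>\\s*<span class=\"rating\">([0-9.]+)</span>");

    // Returns {title, rating} pairs in the order they appear on the page.
    static List<String[]> extract(String html) {
        List<String[]> result = new ArrayList<>();
        Matcher m = FILM.matcher(html);
        while (m.find()) {
            result.add(new String[] { m.group(1).trim(), m.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<li><a class=\"film\" href=\"/f/1\">Alien</a> "
                + "<span class=\"rating\">8.1</span><br></li>";
        for (String[] film : extract(html)) {
            System.out.println(film[0] + ": " + film[1]); // prints "Alien: 8.1"
        }
    }
}
```

The price of this tolerance is fragility: any change in the site's markup silently breaks the pattern, which is one more argument for well-formed XHTML on the source sites.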

The result is an object model of Movie and Rating (the domain has only two entity classes) that holds the parsed results.

Then, depending on the requested output format, this model is rendered either as XHTML using the template system or as XML using a renderer based on NanoXML Lite.
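
A minimal sketch of such a two-class model and an XML renderer could look as follows. Only the class names Movie and Rating come from the article; the fields, the element and attribute names of the output, and the use of a StringBuilder instead of NanoXML Lite are assumptions made for illustration, so the real REST output format may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fields: the article names the classes but does not list their members.
class Rating {
    final String source;
    final double value;

    Rating(String source, double value) {
        this.source = source;
        this.value = value;
    }
}

class Movie {
    final String title;
    final List<Rating> ratings = new ArrayList<>();

    Movie(String title) {
        this.title = title;
    }
}

class XmlRenderSketch {

    // Escapes the characters that are not allowed inside a double-quoted attribute value.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Renders the model in an invented XML format.
    static String render(List<Movie> movies) {
        StringBuilder sb = new StringBuilder("<movies>");
        for (Movie m : movies) {
            sb.append("<movie title=\"").append(escape(m.title)).append("\">");
            for (Rating r : m.ratings) {
                sb.append("<rating source=\"").append(escape(r.source))
                  .append("\" value=\"").append(r.value).append("\"/>");
            }
            sb.append("</movie>");
        }
        return sb.append("</movies>").toString();
    }

    public static void main(String[] args) {
        Movie movie = new Movie("Tom & Jerry");
        movie.ratings.add(new Rating("KinoPoisk", 8.1));
        List<Movie> movies = new ArrayList<>();
        movies.add(movie);
        System.out.println(render(movies));
    }
}
```

Keeping the model free of any rendering logic is what makes the two output formats cheap: the XHTML and XML renderers both read the same Movie and Rating objects.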

Examples of search query results are shown in the figures below:

image

image

Usage


The service is available for public use, including the REST API, which you can freely use in your own projects, as we do on planet33 , to get additional information about films.

I also planned to write a mobile client for this API, but I found another solution that suits me perfectly: XHTML plus the handheld CSS media type and a minimal block layout, which lets me use the service in Opera Mini on my Sony Ericsson.

I have also published the project's source code on Google Code under the GPL license. I hope it will be useful to you and that you will learn something new from it.

The project structure is shown in the figure:

image

The project has several warnings. They come from the Google Eclipse Plugin, which complains that I removed unused jars from the WEB-INF/lib folder ...

To enable the KinoPoisk search, add a Config.properties file to src/dmitrygusev/ratings/utils/ with the following content:

 kinopoisk-username = your-kinopoisk-username
 kinopoisk-password = your-kinopoisk-password
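
For reference, a file in this format can be read with the standard java.util.Properties class. The sketch below is an assumption about how such a file might be consumed, not the project's actual code; ConfigSketch is a hypothetical name, and the file is read from a string here so that the example is self-contained. In the project it would presumably be loaded from the classpath instead.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical loader sketch for a file shaped like Config.properties.
class ConfigSketch {

    // Reads key = value pairs; whitespace around the separator is ignored by Properties.
    static Properties load(Reader reader) {
        Properties config = new Properties();
        try {
            config.load(reader);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read configuration", e);
        }
        return config;
    }

    public static void main(String[] args) {
        String content = " kinopoisk-username = your-kinopoisk-username\n"
                + " kinopoisk-password = your-kinopoisk-password\n";
        Properties config = load(new StringReader(content));
        System.out.println(config.getProperty("kinopoisk-username")); // prints "your-kinopoisk-username"
    }
}
```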


Resources used


  1. Ratings Service on Google Code - code.google.com/p/ratings-service
  2. Google App Engine / Getting Started: Java - code.google.com/appengine/docs/java/gettingstarted
  3. NanoXML Lite Sources - prdownloads.sourceforge.net/nanoxml/NanoXML-2.2.1.tar.gz
  4. Online CSS / XHTML validators from W3C - jigsaw.w3.org/css-validator and validator.w3.org, respectively
  5. CSS block layout - webmolot.com/css
  6. Mobile style - CSS Mobile Profile 2.0 - dev.opera.com/articles/view/mobile-style-css-mobile-profile-2-0
  7. Feedback service Reformal.ru - reformal.ru
  8. FAMFAMFAM Silk Icons (a set of icons in PNG format; see the star image in the service's favicon) - www.famfamfam.com/lab/icons/silk

Source: https://habr.com/ru/post/66211/

