
I first launched Eclipse in the spring: I read some books in English, installed the SDK, played around a little, and abandoned it. At the beginning of winter I bought my first Android smartphone, but what pushed me back to development was a recent post saying that knowledge of C# is enough, and C#, unlike Java, is something I am familiar with. One evening was enough for me to understand that I would not settle on the Visual Studio plus Monodroid combination. Later I read this post, and I fully agree with its author.
After this short introduction, I'll move on to the topic itself. Quite a large number of mobile applications interact with websites, and it is no secret that sometimes you need to get some information from a page, such as an exchange rate, without wanting to do it through a browser.
Most developers fetch the page's HTML and run it through an XML parser, which is the wrong approach, because HTML is not always well-formed XML. For example, it was noted on Habr that the html tag is optional for browsers (a modern browser should display the page without it), and a page may simply contain markup errors. This is where HTML parsing libraries come to the rescue. Of these, I chose HtmlCleaner.
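To illustrate why a strict XML parser is a poor fit, here is a minimal sketch that feeds a small HTML fragment to the JDK's built-in DocumentBuilder. The class name StrictXmlCheck and the sample markup are made up for this example; the point is only that an unclosed br tag, which any browser accepts, is a fatal error for an XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the application in this article.
final class StrictXmlCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                    markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input as not well-formed.
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not expected in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical browser-friendly HTML: <br> and the first <p> are never closed.
        System.out.println(isWellFormedXml(
                "<html><body><p>First<br><p>Second</body></html>"));
        // The same content written as strict XML.
        System.out.println(isWellFormedXml(
                "<html><body><p>First<br/></p><p>Second</p></body></html>"));
    }
}
```

A library such as HtmlCleaner exists precisely to repair the first kind of input into a tree you can walk.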
Below, I'll explain how to add this library to a project and write a simple parser for stackoverflow.com.
I will not explain how to install the Android SDK, Eclipse, and the ADT Plugin. If these words mean nothing to you, visit these two links:
Installing the SDK
ADT Plugin for Eclipse
The main page of stackoverflow.com looks like this:

I will parse the information highlighted by the red rectangles.
Everything here is aimed at beginners, so there will be a lot of pictures. At this stage you should have a fully configured Eclipse. To create a project, click File -> New -> Project..., select Android Project, and then fill out the form:

I am writing for my own device, so I chose version 2.2. The second important parameter is the package name, which should be unique; by convention it is the site's domain name reversed, plus the name of the application. We will not create tests, so feel free to click Finish. The project has now been created. I recommend studying which files it contains and where they are, although I'll admit from my own experience that the number of files that appeared the first time frightened me.
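As a rough illustration of the package naming convention, the sketch below builds a package name from a domain and an application name. The helper PackageNames.fromDomain is hypothetical and exists only for this example; in practice you simply type the name into the form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper illustrating the "reversed domain + app name" convention.
final class PackageNames {

    /** Reverses the dot-separated domain labels and appends the app name. */
    static String fromDomain(String domain, String appName) {
        List<String> parts = new ArrayList<String>(
                Arrays.asList(domain.toLowerCase(Locale.ROOT).split("\\.")));
        Collections.reverse(parts);
        parts.add(appName.toLowerCase(Locale.ROOT));
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // prints "com.stackoverflow.stackparser"
        System.out.println(fromDomain("stackoverflow.com", "StackParser"));
    }
}
```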
Let's start by editing the res\layout\main.xml file. Here I'll delete the TextView and add two controls, a Button and a ListView, change their identifiers, and set android:layout_width="fill_parent" and android:text="Get data" on the button. The finished result looks like this:
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent">
    <Button android:id="@+id/parse"
        android:layout_height="wrap_content"
        android:layout_width="fill_parent"
        android:text="Get data">
    </Button>
    <ListView android:id="@+id/listViewData"
        android:layout_height="wrap_content"
        android:layout_width="match_parent">
    </ListView>
</LinearLayout>
This is the simplest possible interface. If you build an application and decide to publish it on the Market, you should improve it, for example by setting a background with android:background="@drawable/file_name_without_extension", and so on.
For parsing we need to download the htmlcleaner-2.2.jar library and then add it to the project's Build Path. If you have any difficulties, a good guide on how to do this can be found here.
First of all, you need to declare that the application needs Internet access, otherwise it will not work. Add this to the AndroidManifest.xml file:
<uses-permission android:name="android.permission.INTERNET" />
Now we create the HtmlHelper class, which will do the main work:
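import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public class HtmlHelper {
    // Root of the cleaned document tree
    TagNode rootNode;

    // Downloads the page and builds the tree
    public HtmlHelper(URL htmlPage) throws IOException
    {
        // Create the HtmlCleaner object
        HtmlCleaner cleaner = new HtmlCleaner();
        // Load the page's html and clean it into a tree of TagNode objects
        rootNode = cleaner.clean(htmlPage);
    }

    // Returns all links whose class attribute equals CSSClassname
    List<TagNode> getLinksByClass(String CSSClassname)
    {
        List<TagNode> linkList = new ArrayList<TagNode>();
        // Select all links in the document, searching recursively
        TagNode[] linkElements = rootNode.getElementsByName("a", true);
        for (int i = 0; linkElements != null && i < linkElements.length; i++)
        {
            // Read the link's class attribute
            String classType = linkElements[i].getAttributeByName("class");
            // If the attribute exists and matches the requested class, keep the link
            if (classType != null && classType.equals(CSSClassname))
            {
                linkList.add(linkElements[i]);
            }
        }
        return linkList;
    }
}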
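One thing to keep in mind: getLinksByClass compares the whole class attribute with equals, so a link whose attribute holds several classes separated by spaces would be skipped. If that ever becomes a problem, a token-based comparison is one possible fix. The sketch below is plain Java with no HtmlCleaner dependency; the helper CssClassMatch.hasClass and the extra class name "visited" are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: checks one class name against a full class attribute.
final class CssClassMatch {

    /** Returns true if the space-separated class attribute contains the wanted class. */
    static boolean hasClass(String classAttribute, String wanted) {
        if (classAttribute == null) {
            return false;
        }
        String trimmed = classAttribute.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Split on any run of whitespace and look for an exact token match.
        return Arrays.asList(trimmed.split("\\s+")).contains(wanted);
    }

    public static void main(String[] args) {
        System.out.println(hasClass("question-hyperlink", "question-hyperlink"));
        System.out.println(hasClass("question-hyperlink visited", "question-hyperlink"));
        System.out.println(hasClass("question-hyperlink-old", "question-hyperlink"));
    }
}
```

In HtmlHelper, the condition inside the loop could then call hasClass(classType, CSSClassname) instead of classType.equals(CSSClassname).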
In the main class we set a listener on the button and run the parsing asynchronously using AsyncTask. At first I did this by creating a thread and then updating the interface through a Handler, but I read that this is not the best solution and that AsyncTask is better suited for this purpose. While the task is running, I show a dialog that reports on the progress. The main class looks like this:
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.htmlcleaner.TagNode;

import android.app.Activity;
import android.app.ProgressDialog;
import android.os.AsyncTask;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.ArrayAdapter;
import android.widget.Button;
import android.widget.ListView;

public class StackParser extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        // Find the button
        Button button = (Button) findViewById(R.id.parse);
        // Register the onClick listener
        button.setOnClickListener(myListener);
    }

    // Dialog shown while the request is running
    private ProgressDialog pd;

    // OnClickListener for the button
    private OnClickListener myListener = new OnClickListener() {
        public void onClick(View v) {
            // Show the progress dialog
            pd = ProgressDialog.show(StackParser.this, "Working...", "request to server", true, false);
            // Start the parsing in the background
            new ParseSite().execute("http://www.stackoverflow.com");
        }
    };

    private class ParseSite extends AsyncTask<String, Void, List<String>> {
        // Runs on a background thread
        @Override
        protected List<String> doInBackground(String... arg) {
            List<String> output = new ArrayList<String>();
            try
            {
                HtmlHelper hh = new HtmlHelper(new URL(arg[0]));
                List<TagNode> links = hh.getLinksByClass("question-hyperlink");
                // Collect the text of every matching link
                for (TagNode link : links)
                {
                    output.add(link.getText().toString());
                }
            }
            catch (Exception e)
            {
                // On any error, log it and return whatever was collected so far
                e.printStackTrace();
            }
            return output;
        }

        // Runs on the UI thread after doInBackground has finished
        @Override
        protected void onPostExecute(List<String> output) {
            // Hide the progress dialog
            pd.dismiss();
            // Find the ListView
            ListView listview = (ListView) findViewById(R.id.listViewData);
            // Load the data returned by doInBackground into it
            listview.setAdapter(new ArrayAdapter<String>(StackParser.this,
                    android.R.layout.simple_list_item_1, output));
        }
    }
}
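The split that AsyncTask enforces (slow work on a background thread, result handling on the UI thread) can also be sketched outside Android with plain java.util.concurrent, which may help when reading code written after AsyncTask was deprecated in API level 30. Everything below is an illustration under assumptions: BackgroundSketch, fetchTitles, and the sample titles are invented, and a single-threaded executor merely stands in for the UI thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical plain-Java analogue of the doInBackground / onPostExecute split.
final class BackgroundSketch {

    /** Stand-in for the slow download-and-parse step; returns fixed sample titles. */
    static List<String> fetchTitles() {
        return Arrays.asList("First question", "Second question");
    }

    /** Runs fetchTitles() on a worker thread, then formats the result on a separate "UI" thread. */
    static String loadAndRender() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        ExecutorService ui = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture
                    // Plays the role of doInBackground
                    .supplyAsync(BackgroundSketch::fetchTitles, worker)
                    // Plays the role of onPostExecute
                    .thenApplyAsync(titles -> String.join("\n", titles), ui)
                    // Blocking here only so the sketch can return a value;
                    // a real UI would never block its main thread like this.
                    .join();
        } finally {
            worker.shutdown();
            ui.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndRender());
    }
}
```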
If you followed along, you should end up with the following file hierarchy:

After launch, the application should look like this:

Download link: application
Conclusion
We learned that there are quite a lot of libraries for parsing HTML, got acquainted with one of them, and wrote an application that parses a site in the background and shows the result when it is ready. It can be developed further, and perhaps it will then become popular in certain circles. The first thing that comes to mind is opening a question in a new window through a WebView when it is tapped in the list.
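A first step toward the WebView idea would be to keep each link's address next to its title. The href value can be read with getAttributeByName("href"), the same call HtmlHelper already uses for class, but it may be relative to the site. The sketch below shows one way to turn it into an absolute URL with java.net.URI. The helper QuestionLinks.toAbsolute and the sample path are hypothetical, and the assumption that hrefs start with a slash has not been checked against the real page.

```java
import java.net.URI;

// Hypothetical helper for turning a link's href into an absolute URL.
final class QuestionLinks {

    /**
     * Resolves href against the page URL. Assumes href is either already
     * absolute or a path starting with "/".
     */
    static String toAbsolute(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Sample path invented for the example.
        System.out.println(toAbsolute("http://www.stackoverflow.com",
                "/questions/1/example-question"));
    }
}
```

The resulting string is what you would eventually hand to the WebView when a list item is tapped.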