Introduction
In this tutorial I will use a standalone Java SE HTML parser called jsoup. The jsoup download comes as a few separate jar files.
In the Android world size matters, so I only want jsoup-1.6.1-sources.jar. A jar is just an archive, so I use 7-Zip to extract the contents of jsoup-1.6.1-sources.jar into a folder. The next step is to create an Android application in the Eclipse IDE.
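Because a jar is an ordinary zip archive, the extraction step could also be scripted instead of done by hand in 7-Zip. The following is only a sketch using the JDK's java.util.zip classes. The class name JarExtractSketch, the extract method and the jsoup-src output folder are made up for illustration and are not part of jsoup or Android.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class JarExtractSketch {

    /** Unpacks every entry of the jar into targetDir and returns the number of files written. */
    static int extract(Path jar, Path targetDir) throws IOException {
        int files = 0;
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(jar))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Refuse entries that would land outside the target folder.
                if (!out.startsWith(root)) {
                    throw new IOException("Bad entry: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only.
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                    files++;
                }
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical paths; adjust them to where the jar was downloaded.
        int n = extract(Paths.get("jsoup-1.6.1-sources.jar"), Paths.get("jsoup-src"));
        System.out.println("Extracted " + n + " files");
    }
}
```

Run on a desktop JVM, this should leave the same directory tree in jsoup-src that 7-Zip produces, ready to be copied into the Android project.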
After extracting the jsoup sources, copy all of the extracted directories into your project's src folder, so that the org.jsoup packages sit next to your own application package.
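The activity reads a URL from the text field and, when the button is clicked, prints the page's media, imports and links:

    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    import android.app.Activity;
    import android.os.Bundle;
    import android.view.View;
    import android.view.View.OnClickListener;
    import android.widget.Button;
    import android.widget.EditText;

    public class HtmlAParseActivity extends Activity {
        EditText text1;
        Button btn1;

        /** Called when the activity is first created. */
        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.main);
            text1 = (EditText) findViewById(R.id.editText1);
            btn1 = (Button) findViewById(R.id.button1);
            btn1.setOnClickListener(new OnClickListener() {
                @Override
                public void onClick(View v) {
                    // Read the URL on the UI thread, then fetch in the background.
                    // Android 3.0 and later reject network access on the UI thread
                    // with a NetworkOnMainThreadException.
                    final String url = text1.getText().toString();
                    new Thread(new Runnable() {
                        @Override
                        public void run() {
                            listLinks(url);
                        }
                    }).start();
                }
            });
        }

        /** Downloads the page and prints its media, imports and links. */
        private static void listLinks(String url) {
            try {
                Document doc = Jsoup.connect(url).get();
                Elements links = doc.select("a[href]");
                Elements media = doc.select("[src]");
                Elements imports = doc.select("link[href]");

                print("\nMedia: (%d)", media.size());
                for (Element src : media) {
                    if (src.tagName().equals("img"))
                        print(" * %s: <%s> %sx%s (%s)", src.tagName(), src.attr("abs:src"),
                                src.attr("width"), src.attr("height"), trim(src.attr("alt"), 20));
                    else
                        print(" * %s: <%s>", src.tagName(), src.attr("abs:src"));
                }

                print("\nImports: (%d)", imports.size());
                for (Element link : imports) {
                    print(" * %s <%s> (%s)", link.tagName(), link.attr("abs:href"), link.attr("rel"));
                }

                print("\nLinks: (%d)", links.size());
                for (Element link : links) {
                    print(" * a: <%s> (%s)", link.attr("abs:href"), trim(link.text(), 35));
                }
            } catch (IOException e) {
                // The page could not be fetched; log the reason.
                e.printStackTrace();
            }
        }

        /** Formats the message and writes it to standard output. */
        private static void print(String msg, Object... args) {
            System.out.println(String.format(msg, args));
        }

        /** Shortens s to at most width characters, ending with a dot when cut. */
        private static String trim(String s, int width) {
            if (s.length() > width)
                return s.substring(0, width - 1) + ".";
            else
                return s;
        }
    }

Of course, don't forget to add the permissions for the Internet connection to AndroidManifest.xml: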
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
The code in HtmlAParseActivity is taken, with some modifications, from the org.jsoup.examples package.
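To get a feel for what the printed link lines look like without running the app, the two formatting helpers can be tried in plain Java. This is only a sketch: LinkLineSketch and linkLine are hypothetical names introduced here, the sample URLs and captions are made up, and only trim follows the activity's helper. On a device the activity's System.out output appears in LogCat.

```java
final class LinkLineSketch {

    /** Same shortening rule as the activity's trim(): cut to width and end with a dot. */
    static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width - 1) + ".";
        return s;
    }

    /** Builds one link line using the same format string as the activity's link loop. */
    static String linkLine(String absHref, String text) {
        return String.format(" * a: <%s> (%s)", absHref, trim(text, 35));
    }

    public static void main(String[] args) {
        // Made-up sample values, not output from a real page.
        System.out.println(linkLine("http://example.com/", "Home"));
        System.out.println(linkLine("http://example.com/about",
                "A very long link caption that will not fit into the column"));
    }
}
```

Short captions pass through unchanged, while longer ones are cut to 35 characters including the trailing dot, which keeps each line compact.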
That's all.