Tuesday, January 14, 2014

Get Page Title and Image from Url using C# in ASP.NET

Hi,
Today's topic is fetching the title and an image from a URL (e.g. http://facebook.com). I will do this with the help of regular expressions. For more on regular expressions, you can refer to this link.

The main logic for fetching the title is to download the page source, find the <title> tag in it, and put its contents into a string.

Add two namespaces to your page.

using System.Net;                       // for WebClient
using System.Text.RegularExpressions;   // for Regex


To get the whole page source, the code is:

string source;
using (WebClient client = new WebClient())   // dispose the client when done
{
    source = client.DownloadString("http://asp-arka.blogspot.in");
}
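The download can fail (no network, a 404 response, a missing host), so it may be worth wrapping it. Here is a minimal sketch of one way to do that. The names PageFetcher and TryDownload are made up for illustration, and returning null on failure is just one possible convention.

```csharp
using System;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper: returns the page source, or null if the download fails.
    public static string TryDownload(string url)
    {
        try
        {
            // "using" disposes the WebClient once the download has finished.
            using (WebClient client = new WebClient())
            {
                // Assumption: treat the page as UTF-8 text.
                client.Encoding = Encoding.UTF8;
                return client.DownloadString(url);
            }
        }
        catch (WebException)
        {
            // Raised for network errors and non-success HTTP responses.
            return null;
        }
    }
}
```

It could be called as string source = PageFetcher.TryDownload("http://asp-arka.blogspot.in"); followed by a null check before running any regular expression on the result.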

Now we will find the title of the site in this source.

// Capture everything between <title> and </title>, ignoring case, and trim surrounding whitespace
string title = Regex.Match(source, @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>", RegexOptions.IgnoreCase).Groups["Title"].Value.Trim();

After executing this code you will get the value ASP With Arka.
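Titles on other sites often contain HTML entities (such as &amp;) or line breaks inside the tag. The sketch below wraps the same kind of pattern in a hypothetical helper, TitleExtractor.GetTitle, which decodes entities with WebUtility.HtmlDecode and collapses whitespace. It is only an illustration and works on whatever HTML string you pass in.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class TitleExtractor
{
    // Hypothetical helper: returns the cleaned-up <title> text, or "" if none is found.
    public static string GetTitle(string html)
    {
        Match m = Regex.Match(html, @"<title\b[^>]*>(?<Title>[\s\S]*?)</title>",
                              RegexOptions.IgnoreCase);
        if (!m.Success) return "";

        // Turn entities such as &amp; back into plain characters.
        string decoded = WebUtility.HtmlDecode(m.Groups["Title"].Value);

        // Collapse runs of whitespace (including line breaks) into single spaces.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

For example, TitleExtractor.GetTitle("<TITLE>\n Tom &amp; Jerry \n</TITLE>") returns "Tom & Jerry", and a page without a <title> tag gives an empty string.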

Now it's time to get an image from the URL. This code takes the first <img> tag and returns the value of its "src" attribute.

// [^>] keeps the match inside a single <img ...> tag; group 1 is the src value
string img = Regex.Match(source, "<img[^>]+?src=[\"']([^\"']+)[\"'][^>]*>", RegexOptions.IgnoreCase).Groups[1].Value;

We will get the output http://img2.blogblog.com/img/icon18_edit_allbkg.gif.
This is the first image in the generated HTML of the URL.

Which image you get depends on the website's content.
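The src value comes back exactly as it is written in the markup, so on some sites it will be a relative path such as /img/a.png. One way to handle that, sketched below with a hypothetical helper ImageExtractor.GetFirstImageUrl, is to resolve the value against the page address using the Uri class. The pattern is a slightly stricter variant, and pageUrl is assumed to be a valid absolute URL.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImageExtractor
{
    // Hypothetical helper: returns the absolute URL of the first <img>, or null if none is found.
    // Assumption: pageUrl is a valid absolute URL (new Uri throws otherwise).
    public static string GetFirstImageUrl(string html, string pageUrl)
    {
        // [^>]*? keeps the match inside a single tag; \s before src skips
        // attributes that merely end in "src", such as data-src.
        Match m = Regex.Match(html,
            @"<img\b[^>]*?\ssrc\s*=\s*[""']([^""']+)[""']",
            RegexOptions.IgnoreCase);
        if (!m.Success) return null;

        // Resolve relative values such as "/img/a.png" against the page address.
        Uri absolute;
        return Uri.TryCreate(new Uri(pageUrl), m.Groups[1].Value, out absolute)
            ? absolute.AbsoluteUri
            : null;
    }
}
```

With the page address http://example.com/blog/post.html (an invented example), a tag like <img src="/img/a.png"> resolves to http://example.com/img/a.png, while a src that is already absolute is returned unchanged.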

Hopefully you have enjoyed this article.

Download the source code here and don't forget to comment.