How can I extract all the contents of a website, not just a single webpage? Given a website, how can we get the contents of all the pages of that site? I have tested some C# code, but it only gets the contents of a single page of a website.

using System.IO;
using System.Net;
using System.Text;

string urlAddress = "";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();

if (response.StatusCode == HttpStatusCode.OK)
{
    Stream receiveStream = response.GetResponseStream();
    StreamReader readStream = null;

    // Use the charset reported by the server, if there is one.
    if (String.IsNullOrWhiteSpace(response.CharacterSet))
        readStream = new StreamReader(receiveStream);
    else
        readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));

    string data = readStream.ReadToEnd();

    // Release the reader and the connection once the page has been read.
    readStream.Close();
    response.Close();
}

  1. Create a list containing all the URLs that have already been scraped.
  2. Create a loop that starts with a given URL: add it to the URL list, scrape the content of that page, and search it for href attributes (= new URLs). For each new URL that is not already in the list, repeat step 2 with that URL. Continue as long as there are new URLs that have not been scraped yet.

Note that you may want to check whether a URL is still on the same domain; otherwise you might accidentally scan the whole internet.
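
The two steps and the same-domain check could be sketched roughly like this. It is only a minimal sketch, not a finished crawler: the names SiteCrawler, ExtractLinks, CrawlAsync and the maxPages limit are made up for illustration, and the regex is a crude stand-in for a real HTML parser (it picks up every href attribute, including those on <link> tags).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class SiteCrawler   // hypothetical name, for illustration only
{
    // Rough pattern for href="..." or href='...'; fragments (#...) are cut off.
    private static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);

    // Returns the absolute http(s) URLs found in the page, resolved against baseUri.
    public static List<Uri> ExtractLinks(string html, Uri baseUri)
    {
        var links = new List<Uri>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            if (Uri.TryCreate(baseUri, match.Groups[1].Value, out Uri absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // Visits pages until no unseen same-host URL is left or maxPages is reached.
    public static async Task<Dictionary<Uri, string>> CrawlAsync(Uri start, int maxPages)
    {
        var pages = new Dictionary<Uri, string>();
        var seen = new HashSet<Uri> { start };   // step 1: URLs already queued or scraped
        var pending = new Queue<Uri>();
        pending.Enqueue(start);

        using (var client = new HttpClient())
        {
            while (pending.Count > 0 && pages.Count < maxPages)
            {
                Uri current = pending.Dequeue();
                string html;
                try { html = await client.GetStringAsync(current); }
                catch (HttpRequestException) { continue; }   // skip pages that fail to load
                catch (TaskCanceledException) { continue; }  // skip pages that time out

                pages[current] = html;

                // Step 2: queue every link that is new and still on the starting host.
                foreach (Uri link in ExtractLinks(html, current))
                {
                    if (link.Host == start.Host && seen.Add(link))
                        pending.Enqueue(link);
                }
            }
        }
        return pages;
    }
}
```

You would call it with something like await SiteCrawler.CrawlAsync(new Uri(startUrl), 100), where startUrl is your site's front page. A queue is used instead of recursion so that a large site cannot overflow the stack, and the page limit is there as a safety net.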

When you load that page in a browser, it will only get (server-side browser switching aside) what you get with your request. What the browser then does, and what you need to do in your code, is parse this content: it contains references (e.g. via <script>, <img>, <link>, <iframe> and others) that give the URLs of the other resources to load.
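
As a rough illustration of that parsing step, the sketch below pulls the src of <script>, <img> and <iframe> tags and the href of <link> tags out of an HTML string and resolves them against the page URL. ResourceFinder and FindResources are made-up names, and a regex only approximates HTML parsing (it will miss unquoted attributes, for example), so treat this as a starting point and not a complete solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ResourceFinder   // hypothetical name, for illustration only
{
    // Group 1: src of <script>, <img> or <iframe>. Group 2: href of <link>.
    private static readonly Regex ResourcePattern = new Regex(
        "<(?:script|img|iframe)\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']" +
        "|<link\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns the referenced resources as absolute URLs, in document order.
    public static List<Uri> FindResources(string html, Uri pageUri)
    {
        var resources = new List<Uri>();
        foreach (Match match in ResourcePattern.Matches(html))
        {
            string raw = match.Groups[1].Success
                ? match.Groups[1].Value
                : match.Groups[2].Value;

            // Resolve relative references such as "css/site.css" against the page URL.
            if (Uri.TryCreate(pageUri, raw, out Uri absolute))
                resources.Add(absolute);
        }
        return resources;
    }
}
```

Each returned URL can then be requested the same way as the page itself.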

It might be easier to use a prebuilt application such as wget, if it does what you need, or to use browser automation.

I have recently used the WatiN web application testing package to get website text using C#. WatiN was not the easiest package to get set up for

If you want to download a complete website including all of its contents, you can use the HTTrack software. HTTrack allows users to download World Wide Web sites from the Internet to a local computer. Here is the link you can follow.

Suppose I have a table in a webpage. I want to get all the HTML and CSS content and then put it in Excel. I have already done this through WebBrowser in C# (.NET) for the case where the table data is constant. The problem is that WebBrowser doesn't support all the CSS and jQuery functions, and my table data is not constant.

You can simply use the following code to read a whole webpage.

HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(URL);
myRequest.Method = "GET";
WebResponse myResponse = myRequest.GetResponse();
StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
string result = sr.ReadToEnd();
sr.Close();
myResponse.Close();

HTTRACK works like a champ for copying the contents of an entire site. This tool can even grab the pieces needed to make a website with active code content work offline. I am amazed at the stuff it can replicate offline. This program will do all you require of it.