How do I use the HTML Agility Pack?

My XHTML document is not completely valid, which is why I want to use the HTML Agility Pack. How do I use it in my C# project?

1 Answer

First, install the HtmlAgilityPack NuGet package in your project (for example, with dotnet add package HtmlAgilityPack).

Then you can try something like the example below:

    // Any() below needs "using System.Linq;" at the top of the file
    HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();

    // There are various options; set them as needed
    htmlDoc.OptionFixNestedTags = true;

    // filePath is the path to a file containing the HTML
    htmlDoc.Load(filePath);

    // To load from a string instead, use: htmlDoc.LoadHtml(htmlString);
    // (this was previously htmlDoc.LoadXML(xmlString))

    // ParseErrors enumerates any errors found by the Load call
    if (htmlDoc.ParseErrors != null && htmlDoc.ParseErrors.Any())
    {
        // Handle any parse errors as required
    }
    else
    {
        if (htmlDoc.DocumentNode != null)
        {
            HtmlAgilityPack.HtmlNode bodyNode = htmlDoc.DocumentNode.SelectSingleNode("//body");

            if (bodyNode != null)
            {
                // Do something with bodyNode
            }
        }
    }

Note: The above code is only an example and not necessarily the only approach, so do not use it blindly in your own application.

The HtmlDocument.Load() method also accepts a stream, which is very useful for integrating with other stream-oriented classes in the .NET Framework. HtmlEntity.DeEntitize() is another important method, used to process HTML entities correctly.
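As a rough illustration of both points, the sketch below loads HTML from a MemoryStream and decodes the entities in a paragraph's text. The class name, method name, and sample markup are made up for this example, and it assumes the input is UTF-8; treat it as one possible approach rather than the recommended one.

```csharp
using System;
using System.IO;
using System.Text;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only
public static class StreamLoadingSketch
{
    // Parses HTML from any readable stream and returns the decoded
    // text of the first <p> element, or null if there is none.
    public static string FirstParagraphText(Stream htmlStream)
    {
        var doc = new HtmlDocument();
        doc.Load(htmlStream, Encoding.UTF8); // assumption: the stream is UTF-8

        HtmlNode paragraph = doc.DocumentNode.SelectSingleNode("//p");
        if (paragraph == null)
        {
            return null;
        }

        // InnerText keeps entities such as &amp; as written;
        // DeEntitize turns them into the characters they stand for.
        return HtmlEntity.DeEntitize(paragraph.InnerText);
    }

    public static void Main()
    {
        // Sample markup invented for this sketch
        byte[] bytes = Encoding.UTF8.GetBytes(
            "<html><body><p>Fish &amp; chips</p></body></html>");

        using (var stream = new MemoryStream(bytes))
        {
            Console.WriteLine(FirstParagraphText(stream));
        }
    }
}
```

Because the method takes a plain Stream, the same code should work with a FileStream or a network response stream in place of the MemoryStream.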

HtmlDocument and HtmlNode are the classes you will use the most. Similar to an XML parser, HtmlNode provides the SelectSingleNode and SelectNodes methods, which accept XPath expressions. Also pay attention to the HtmlDocument.Option... Boolean properties (such as OptionFixNestedTags in the example above). These control how the Load and LoadHtml methods will process your HTML/XHTML.
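To make the XPath methods more concrete, here is a minimal sketch that collects the href of every link in a document that is not valid XHTML. The class name, method name, and sample markup are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only
public static class XPathSketch
{
    // Returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.OptionFixNestedTags = true; // set options before loading
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // The <p> is never closed, so this sample is not valid XHTML
        string html = "<html><body><p>See <a href=\"/docs\">docs</a> " +
                      "and <a href=\"/faq\">FAQ</a></body></html>";

        foreach (string href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

The null check on the result of SelectNodes is the detail most worth copying: iterating the result directly throws a NullReferenceException when the XPath expression matches nothing.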

There is also a compiled help file called HtmlAgilityPack.chm that has a complete reference for each of the objects. This is normally in the base folder of the solution.
