What is the xpath for the container of this HTML list?

Being unfamiliar with JUnit, I'm interested in scraping data from a website -- at least for now.

I see that the fragment extends a base class:

import org.junit.Test;

public class PageFragmentsExampleTest extends TestBase {

    @Test
    public void shareSecondPost()  {
        FacebookSportPostsPage facebookPage = FacebookSportPostsPage.open();
        FacebookPostFragment secondPost = facebookPage.getPostByIndex(2);
        secondPost.share();
    }

    @Test
    public void sharePostFromDate()  {
        FacebookSportPostsPage.open().getPostByText("April 16 at 7:35am").share();
    }

}

but how is that fragment used? It seems that the container is passed to the fragment constructor.

What is the container for the books catalogue?

Using inspect element I get an xpath of /html/body/div/div/div/aside/div[2]/ul/li/a for the "Books" link.

But, this is very different from the sample xpath String of

"//*[contains(text(),'%s')]//ancestor::div[@class='%s']", (text, POST_CONTAINER_CLASS)

What is the xpath for the "Books" catalogue container?

A good option is to use a CSS selector, which is more reliable than XPath.

For the Books link:

  1. The .nav.nav-list CSS classes point to the list that holds all the book categories. In a CSS selector, each class name is prefixed with a dot (.).
  2. Step down to the li tag with >, which selects a direct child node.
  3. Step down to the anchor (a) tag, which is the Books link; findElement returns the first match.

    driver.findElement(By.cssSelector(".nav.nav-list>li>a")).click();
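
For comparison with the XPath in the question, the same three steps can be approximated in XPath 1.0. The sketch below is only an illustration: it uses the JDK's built-in XPath engine instead of Selenium, the SAMPLE markup is a simplified, hypothetical stand-in for the real sidebar, and the class and method names (CssToXPathSketch, hasClass, booksLinkXPath, firstLinkText) are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CssToXPathSketch {

    // Hypothetical, simplified sidebar markup (an assumption, not the real page source).
    static final String SAMPLE =
        "<html><body><div class='side_categories'>"
      + "<ul class='nav nav-list'><li><a href='books.html'> Books </a>"
      + "<ul><li><a href='travel.html'>Travel</a></li>"
      + "<li><a href='mystery.html'>Mystery</a></li></ul>"
      + "</li></ul></div></body></html>";

    // XPath 1.0 test for "the class attribute contains this token",
    // the usual counterpart of ".name" in a CSS selector.
    static String hasClass(String name) {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
    }

    // Rough XPath counterpart of the CSS selector ".nav.nav-list>li>a".
    static String booksLinkXPath() {
        return "//*[" + hasClass("nav") + " and " + hasClass("nav-list") + "]/li/a";
    }

    // Returns the trimmed text of the first matching link, or null if nothing matches.
    static String firstLinkText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Node link = (Node) XPathFactory.newInstance().newXPath()
            .evaluate(booksLinkXPath(), doc, XPathConstants.NODE);
        return link == null ? null : link.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstLinkText(SAMPLE)); // prints "Books"
    }
}
```

The nested ul has no class attribute, so only the top-level list matches and the first link found is the Books link. In Selenium the generated expression could be passed to By.xpath in the same way the selector above is passed to By.cssSelector.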
    

If you want to get <div class="side_categories"> matched by XPath like

"//*[contains(text(),'%s')]//ancestor::div[@class='%s']", (text, POST_CONTAINER_CLASS)

you can try

"//*[normalize-space()='Books']//ancestor::div[@class='side_categories']"
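
To check how this expression resolves, here is a minimal sketch that builds it with String.format, as the sample does, and evaluates it with the JDK's own XPath engine. The SAMPLE markup is a simplified, hypothetical stand-in for the sidebar, and ContainerXPathSketch, containerXPath and findContainer are names invented for this illustration; they are not part of the page-fragment library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ContainerXPathSketch {

    // Hypothetical, simplified sidebar markup (an assumption, not the real page source).
    static final String SAMPLE =
        "<html><body><div class='side_categories'>"
      + "<ul class='nav nav-list'><li><a href='books.html'> Books </a>"
      + "<ul><li><a href='travel.html'>Travel</a></li>"
      + "<li><a href='mystery.html'>Mystery</a></li></ul>"
      + "</li></ul></div></body></html>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Fills the text and the container class into the expression.
    static String containerXPath(String text, String containerClass) {
        return String.format(
            "//*[normalize-space()='%s']//ancestor::div[@class='%s']",
            text, containerClass);
    }

    // Returns the first matching container element, or null if there is none.
    static Element findContainer(Document doc, String text, String containerClass)
            throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Element) xpath.evaluate(
            containerXPath(text, containerClass), doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Element container = findContainer(parse(SAMPLE), "Books", "side_categories");
        // prints "div.side_categories"
        System.out.println(container.getTagName() + "." + container.getAttribute("class"));
    }
}
```

The expression first finds the element whose whitespace-normalized text is exactly Books (the link), then walks up to the enclosing div with the given class, which is the container.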

XPath for the book category links

//li/ul/li/a

XPath for the book items

//ol/li
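
As a rough check of what these two expressions select, the sketch below runs them against a small, hypothetical page fragment using the JDK's XPath engine. The markup, the class name CatalogueXPathSketch and the helper texts are assumptions made for this example; the real page has more categories and richer book entries.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CatalogueXPathSketch {

    // Hypothetical, simplified page: a category sidebar plus an ordered list of books.
    static final String SAMPLE =
        "<html><body>"
      + "<div class='side_categories'><ul class='nav nav-list'><li>"
      + "<a href='books.html'>Books</a>"
      + "<ul><li><a href='travel.html'>Travel</a></li>"
      + "<li><a href='mystery.html'>Mystery</a></li></ul>"
      + "</li></ul></div>"
      + "<ol><li>First book</li><li>Second book</li><li>Third book</li></ol>"
      + "</body></html>";

    // Evaluates the expression and returns the trimmed text of every matching node.
    static List<String> texts(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(texts(SAMPLE, "//li/ul/li/a"));   // prints "[Travel, Mystery]"
        System.out.println(texts(SAMPLE, "//ol/li").size()); // prints "3"
    }
}
```

Note that //li/ul/li/a matches only the links in the nested list, so the top-level Books link itself is not included.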

Comments
  • What do you mean by "Books" catalogue container? What is your target element?
  • Learning the terminology here. I think that the target element would be the list itself, if I understand correctly. I'd like to "pull" out the relevant html, and pass only that fragment to a handler class (?) specific to that chunk.
  • I've heard of speed advantages for css selector, but wasn't aware it could be more reliable as well. Why more reliable?
  • @Thufir - I said CSS selectors are more reliable than XPath; it's a comparison. The reason is that an XPath follows the hierarchy of the HTML page as a tree of XML nodes, which changes frequently and makes XPath more prone to failure, whereas CSS selectors rely on the page's styling class names, which do not change as often, making them more reliable.
  • @AmitJain , absolutely not :) I guess you're talking about absolute XPath, but actually relative XPath is more powerful and definitely not less reliable than CSS-selector
  • weird usage for "class", but okay. Leaving open until I get cracking on writing actual code.
  • @Thufir , if you want to select the unordered list (ul) with the book categories, try //div[@class="side_categories"]//ul[not(@class)]
  • beautifully simple. My speed ;)