Simple html dom file_get_html not working - is there any workaround?

<?php
// Report all PHP errors
error_reporting(E_ALL);

include('inc/simple_html_dom.php');

    //base url
    $base = 'https://play.google.com/store/apps';

    //home page HTML
    $html_base = file_get_html( $base );

    //get all category links
    foreach($html_base->find('a') as $element) {
        echo "<pre>";
        print_r( $element->href );
        echo "</pre>";
    }

    $html_base->clear(); 
    unset($html_base);

?>

I have the above code and I'm trying to get certain elements of the Play Store page but it isn't returning anything. Is it possible that certain PHP functions might be disabled on the server to stop that?

The above code works perfectly on other sites.

Is there any workaround?

As I said, your example is working fine for me... But try this way using curl instead:

//base url
$base = 'https://play.google.com/store/apps';

$curl = curl_init();
// Skips certificate verification: insecure, so use it only as a quick test
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);

// Create a DOM object
$html_base = new simple_html_dom();
// Load HTML from a string
$html_base->load($str);

//get all category links
foreach($html_base->find('a') as $element) {
    echo "<pre>";
    print_r( $element->href );
    echo "</pre>";
}

$html_base->clear(); 
unset($html_base);

It gets all the links as expected.

Also make sure the php_openssl and php_curl extensions are installed and enabled.

@MichaelButler Hello Michael, thank you for your response. file_get_html() is a function of PHP Simple HTML DOM Parser, and I am not using any proxy server, so I guess the problem isn't related to a proxy.

Remove the leading semicolon from the php_openssl line in php.ini, then restart the Apache server so the extension is loaded:

; Windows Extensions
...
;extension=php_openssl.dll
...
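
To confirm whether the change took effect, you can ask PHP itself which extensions and stream wrappers are available. This is only a minimal sketch and not part of the original answer; the helper name `https_support_report` is hypothetical.

```php
<?php
// Hypothetical helper: reports whether this PHP build can fetch https:// URLs.
function https_support_report(): array
{
    return [
        // OpenSSL must be loaded for the https:// stream wrapper to be registered
        'openssl_loaded' => extension_loaded('openssl'),
        // cURL is only needed for the curl-based workaround
        'curl_loaded'    => extension_loaded('curl'),
        // file_get_html() goes through file_get_contents(), which needs this wrapper
        'https_wrapper'  => in_array('https', stream_get_wrappers(), true),
        // Path of the php.ini actually in use (false if none was loaded)
        'php_ini'        => php_ini_loaded_file(),
    ];
}

print_r(https_support_report());
```

If `https_wrapper` is false, check the `php_ini` path first: WAMP and similar stacks may use a different php.ini for Apache than for the command line, so run the script through the web server as well.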

file_get_html returns null, but cURL should fix it [PHP]: I have been using the file_get_html function of Simple HTML DOM parser to ... Before I explain why file_get_html() does not work, have a look at my ...

Normally, it would be silly. Except perhaps you can't use either file_get_contents or file_get_html because of limitations (e.g. allow_url_fopen is off), in which case you are probably going to collect the string with curl and then use str_get_html. Not really the OP's question, but it might help someone. – niccol May 16 '19 at 22:03

You must set "allow_url_fopen" to TRUE in "php.ini" to allow accessing files via HTTP or FTP. Some hosting vendors disable PHP's "allow_url_fopen" flag for security reasons.
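
If you cannot change php.ini, one option is to check the flag at runtime and fall back to cURL when it is off. The sketch below is an assumption about how that could look, not code from the thread; `fetch_page` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return the body of $url as a string, or false on failure.
function fetch_page(string $url)
{
    // ini_get() returns a string such as "1" or ""; normalise it to a boolean
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        // Warnings are suppressed here; the caller checks for false instead
        return @file_get_contents($url);
    }

    // allow_url_fopen is off: fall back to cURL if the extension is loaded
    if (!function_exists('curl_init')) {
        return false;
    }
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    return curl_exec($curl);
}
```

With simple_html_dom.php included, the returned string can then be handed to `str_get_html()` instead of calling `file_get_html()` on the URL directly.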

I've been looking everywhere for how to check a "JS code snippet" installation. As it happens, I'm using HTML DOM Parser for other tasks like checking form POST endpoints. I know there are other server-side techniques to validate a JS snippet installation, but the snippet of code is for a third-party tool; I just wanted to check whether a certain web page contains that snippet or not. Any thought on

$post = curl_init();
curl_setopt($post, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($post, CURLOPT_AUTOREFERER, true);
curl_setopt($post, CURLOPT_HEADER, false);
curl_setopt($post, CURLOPT_RETURNTRANSFER, true);
curl_setopt($post, CURLOPT_URL, $website);
curl_setopt($post, CURLOPT_POST, true);
curl_setopt($post, CURLOPT_POSTFIELDS, "regno=$Number");
curl_setopt($post, CURLOPT_FOLLOWLOCATION, true);
$curlresponse = curl_exec($post);
// The status code is only available after curl_exec() has run
$httpCode = curl_getinfo($post, CURLINFO_HTTP_CODE);
curl_close($post);
$dom = new DOMDocument();
$dom->loadHTML($curlresponse);

DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseStartTag: misplaced

This is the URL: http://www.annauniv.edu/cgi-bin/result/cgrade.pl?regno=11210104001
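
Warnings of this kind come from libxml when the markup is not well-formed, and the document usually still loads. A common way to handle them is to collect the parse errors internally instead of letting them surface as PHP warnings. The following is a sketch under that assumption; `extract_links` is a hypothetical helper, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper: collect href values from possibly malformed HTML.
function extract_links(string $markup): array
{
    // Buffer libxml parse errors instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($markup);

    // Discard the buffered errors and restore the previous setting
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Sample input with a stray second <html> tag in the body
print_r(extract_links('<html><body><p>one</p><html><a href="/x">link</a></body></html>'));
```

If you need to see what the parser complained about, call `libxml_get_errors()` before `libxml_clear_errors()`.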

Get part of whole page with file_get_html and PHP Simple HTML: Problem with finding

A: The "file_get_dom" function is a wrapper of the "file_get_contents" function, $html = file_get_html('http://www.php.net', false, $context);

I have the impression this is a workaround but not a solution. If you access your site via github.com, you want the content of readme.md displayed, so people know what this project is about. If people access your website via github.io, you want the content of index.html displayed. I was under the impression that the right way to do this is

PHP Simple HTML DOM Parser returning false on valid url (r/PHPhelp): EDIT: I have just realised that you are already using a library: PHP Simple HTML DOM Parser. Removing those lines should fix it.

Same origin policy for accessing DOM. A webpage inside an iframe/frame is not allowed to modify or access the DOM of its parent or top page, and vice versa, if both pages don’t belong to the same origin. There are three ways of bypassing this restriction:

  • window.document.domain variable manipulation
  • Proxy
  • Cross Document Messaging

PHP simple_html_dom.php on a normal HTML page: This seems to be a duplicate of this problem: Simple HTML DOM returning false. Also, as you found out, you can work around the file size check by doing this ... you should not use the default function file_get_html for ...

JavaScript is not as permissive as HTML and CSS, however: if the JavaScript engine encounters mistakes or unrecognized syntax, more often than not it will throw errors. There are a number of modern JavaScript language features defined in recent versions of the specs (ECMAScript 6 / ECMAScript Next) that simply won't work in older browsers.

PHP Simple HTML DOM Parser / Bugs / Search: But the PHP below, which is an include plus extra PHP code, doesn't work on a HTML page (but <?php include('simplehtmldom/simple_html_dom.php'); // get DOM from URL or file $html = file_get_html('news-info-school-status.html'); // find all div tags It runs simple PHP includes OK but not this example labelled as .html.

The Document Object Model (DOM) represents the structure of a web document as a tree of nodes. When an HTML file is loaded into a browser, the browser interprets the HTML and displays the document in a window. The following diagram shows a simple HTML file and the resulting web browser page in Chrome. HTML uses tags to describe the document.

Comments
  • Working fine for me tho...
  • wow thank you, as you said, I just needed to activate the "php_openssl" extension and it works now :) I'm using WAMP Server on windows and it was inactive by default. Thanks man!