simple html dom php

  2022-05-26
Related to PHP web scraping.
$Rev: 194 $

[Updates - add some ability to insert and create nodes.]
[1: add ability to search the "noise" array]

[PHP Simple HTML Dom version 1.5 released.]
1. Memory leak fixed!
2. Added support for detecting the source HTML character set. This is used to convert characters when plaintext is requested.
3. Other little fixes and features, too numerous to categorize.

[Ongoing]
1. Error of "file_get_contents()" will be thrown as an exception.
2. Add flag: LOCK_EX while calling "file_put_contents()".
3. Fix the typo of "token_blank_t".

[PHP Simple HTML DOM Parser v1.11 is released]
1. Supports xpath generated from Firebug.
2. New method "dump" of "simple_html_dom_node".
3. New attribute "xmltext" of "simple_html_dom_node".
4. Remove preg_quote on selector match function: [attribute*=value].
5. "Comment" elements are now treated as children.
6. Fixed the problem with <pre>.
7. Fixed bug #2207477 (does not load some pages properly).
8. Fixed bug #2315853 (Error with character after < sign).

[PHP Simple HTML DOM Parser v1.10 is released]
1. Negative index support in the "find" method, thanks to Vadim Voituk.
2. Constructor automatically loads contents from either text or a file/URL, thanks to Antcs.
3. Fully supports wildcards in selectors.
4. Fixed bug where a < symbol inside the text confused the parser.
5. Fixed bug of dash in selectors.
6. Fixed bug of <nobr>.
7. Fixed bug #2155883 (Nested List Parses Incorrectly).
8. Fixed bug #2155113 (error with unclosed html tags).

[PHP Simple HTML DOM Parser v1.00 is released]
1. New method "getAllAttributes" of "simple_html_dom_node".
2. Fix the bug of selector in some critical conditions.
3. Fix the bug of stripping PHP tags.
4. Fix the bug of remove_noise().
5. Fix the bug of noise in attributes.
6. Supports full JavaScript string in selector: $e->find("a[onclick=alert('hello')]").
7. Change selector "*=" to case-insensitive.

[PHP Simple HTML DOM Parser v0.99 is released]
1. Performance tuning (boost 10%).
2. Memory requirement reduced by 25%.
3. Change function name from "file_get_dom()" to "file_get_html()".
4. Change function name from "str_get_dom()" to "str_get_html()".
5. Fixed bug #2011286 (Error with unclosed html tags).
6. Fixed bug #2012551 (Error parsing divs).
7. Fixed bug #2020924 (Error for missed tag.).
8. Fixed bug (problem with <body> tag's innertext).

[PHP Simple HTML DOM Parser v0.98 is released]
1. Performance tuning (boost 20%).
2. Supports "multiple class" selector feature: <div class="a b c"></div>.
3. New "callback function" feature.
4. New "multiple selectors" feature: $dom->find('p,a,b');
5. New examples.
6. Supports extracting contents from HTML: $dom->plaintext;
7. Fix the bug of $dom->clear().
8. Fix the bug of text nodes' innertext.
9. Fix the bug of comment nodes' innertext.
10. Fix the bug of descendant selector with optional tags.
11. Change simple_html_dom_node method name from "text()" to "makeup()".

[PHP Simple HTML DOM Parser v0.97 is released]
1. Important!! File and class name changed (html_dom_parser->simple_html_dom)!
2. Important!! ($dom->save_file) is no longer supported.
3. New node type "comment" (e.g. $dom->find('comment')).
4. Add self-closing tags: 'base', 'spacer'.
5. Fix the bug of outertext (th).
6. Fix the bug of regular expression escaping chars ($dom->find).
7. Fix the bug with line breaks and "\t" in tags.
8. Remove example "example_customize_parser.php".
9. New example "simple_html_dom_utility.php".

[PHP Simple HTML DOM Parser v0.96 is released]
1. (Request #1936000) New DOM operations (first_child, last_child, next_sibling, previous_sibling).
2. New method to remove attribute.
3. Add a solution to the FAQ for servers behind a proxy (Thanks to Yousuke Shaggy).
4. Add traverse section in manual.
5. Now file_get_dom supports full file_get_contents parameters.
6. Fix the bug of self-closing tags at the end of file.
7. Fix the bug of blanks at the end of tag.
8. Add Reference section in manual.
#. Fix some typos in the test cases.

[PHP Simple HTML DOM Parser v0.95 is released]
1. New attribute filters (Thanks to Yousuke Kumakura).
2. Fix the bug of optional-closing tags.
3. Fix the bug of parsing the line break next to the tag's name.
4. Supports tag name with namespace.
#. Refine structure of the test cases.

[PHP Simple HTML DOM Parser v0.94 is released]
1. Stop infinite loop when the source content is bad HTML.
2. Fix the bug of adding new attributes to self-closing tags.
3. Fix the bug of customized parser without $dom->remove_noise();
4. Add FAQ section in manual.
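
The two in-progress items in the changelog about "file_get_contents()" and "file_put_contents()" describe behavior that can be sketched with PHP built-ins alone. The snippet below is only an illustration under stated assumptions, not the library's own code: the helper names read_or_throw() and save_locked() are hypothetical, and RuntimeException is an assumed choice of exception type.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of simple_html_dom): read a file or URL
 * and turn a failed file_get_contents() call into an exception.
 */
function read_or_throw(string $source): string
{
    // file_get_contents() returns false on failure; "@" suppresses the
    // accompanying warning so callers only deal with the exception.
    $contents = @file_get_contents($source);
    if ($contents === false) {
        throw new RuntimeException("Unable to read: {$source}");
    }
    return $contents;
}

/**
 * Hypothetical helper: write a string while holding an exclusive lock,
 * so concurrent writers do not interleave their output.
 */
function save_locked(string $path, string $data): int
{
    $bytes = file_put_contents($path, $data, LOCK_EX);
    if ($bytes === false) {
        throw new RuntimeException("Unable to write: {$path}");
    }
    return $bytes;
}

// Round trip through a temporary file.
$path = tempnam(sys_get_temp_dir(), 'shd');
$bytes = save_locked($path, '<p>hello</p>');
$roundTrip = read_or_throw($path);
unlink($path);

// Reading the deleted file should now raise instead of returning false.
try {
    read_or_throw($path);
    $threw = false;
} catch (RuntimeException $e) {
    $threw = true;
}
```

Raising an exception lets a scraping loop catch one failed page and continue with the next, and LOCK_EX matters mainly when several scraper processes may write to the same output file.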