PHP: xml_parser "Mismatched tag" error when parsing HTML (auto-closing tags such as <img>)?


I want to parse HTML using PHP. I have used xml_parser for this, but it cannot cope with auto-closing tags such as <img>. For example, the following HTML snippet produces a "Mismatched tag" error when the parser reaches the closing tag </a>:

    <a><img src="url"><br></a>

Obviously, the reason for this is that xml_parser does not know that the tags <img> and <br> do not need to be closed.
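The failure can be reproduced with a short script. This is only a sketch: `firstXmlError` is a hypothetical helper name, not part of PHP, and it relies on the standard `xml_parser_create`, `xml_parse`, `xml_get_error_code`, and `xml_error_string` functions.

```php
<?php
// Hypothetical helper: returns null if $markup is well-formed XML,
// otherwise the parser's message for the error it stopped on.
function firstXmlError(string $markup): ?string
{
    $parser = xml_parser_create();
    // The third argument marks this as the final (here: only) chunk.
    if (xml_parse($parser, $markup, true) === 1) {
        return null;
    }
    return xml_error_string(xml_get_error_code($parser));
}

// The snippet from the question: <img> and <br> are never closed.
var_dump(firstXmlError('<a><img src="url"><br></a>'));

// The same content written as well-formed XML parses without an error.
var_dump(firstXmlError('<a><img src="url" /><br /></a>')); // NULL
```

The first call should return an error message rather than null, because an XML parser treats every opening tag as needing a matching close.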

I know that I could rewrite the HTML to <img src="url" /> and <br /> to make the parser happy. However, I want the parser to process such HTML correctly as it is, because the snippet above is valid HTML.
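For reference, the rewrite workaround mentioned above could look roughly like this. It is a naive, regex-based sketch, not a robust solution: `selfCloseVoidTags` is a hypothetical helper, the default tag list is the set of void elements from the HTML standard, and the pattern does not handle a ">" inside attribute values, comments, or script contents.

```php
<?php
// Hypothetical helper: rewrites unclosed void elements such as <br> or
// <img ...> into the XML form <br /> or <img ... />.
function selfCloseVoidTags(
    string $html,
    array $voidTags = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
                       'input', 'link', 'meta', 'source', 'track', 'wbr']
): string {
    $names = implode('|', array_map('preg_quote', $voidTags));
    // Match "<name", optional attributes, and a ">" that is not
    // already preceded by "/" (so <br /> is left untouched).
    $pattern = '~<(' . $names . ')\b([^>]*?)\s*(?<!/)>~i';
    return preg_replace($pattern, '<$1$2 />', $html);
}

echo selfCloseVoidTags('<a><img src="url"><br></a>'), "\n";
// <a><img src="url" /><br /></a>
```

This only makes simple markup acceptable to an XML parser. Real-world HTML has other constructs that are not well-formed XML, such as optional closing tags and unquoted attributes, which is why a dedicated HTML parser is usually the better choice.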

So I need to tell the parser, when it encounters the opening tag, whether this tag is auto-closing. Is this possible in some way? The parser could have an option to supply a list of self-closing tag names. However, I did not find any function for this. It may therefore also be that HTML is simply not supported by this parser.

An acceptable solution would also be to disable the tag mismatch check altogether (or to apply an HTML-compliant version of it).

However, there may be an HTML-specific PHP parser that I have overlooked. Any suggestions for other common parsers that I could use?

This is what I have so far:

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $listings = $dom->getElementsByTagName('ul');
    // bla bla bla
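As a quick check that this approach copes with unclosed tags, the following sketch loads the snippet from the question with DOMDocument and lists the children of the <a> element. The sample value of `$html` and the `LIBXML_NOERROR | LIBXML_NOWARNING` options are assumptions added here for illustration. The options only keep the parser from printing diagnostics.

```php
<?php
// Sample input taken from the question; <img> and <br> are not closed.
$html = '<a><img src="url"><br></a>';

$dom = new DOMDocument();
// loadHTML() uses libxml's HTML parser, which treats <img> and <br>
// as elements without content instead of waiting for closing tags.
$dom->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

// Collect the element names found directly inside the first <a>.
$children = [];
$anchor = $dom->getElementsByTagName('a')->item(0);
foreach ($anchor->childNodes as $child) {
    $children[] = $child->nodeName;
}

print_r($children); // img and br, both as children of <a>
```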

My suggestion is to try a library made specifically for HTML parsing. Here are some suggestions:

  • May the force be with you!

