As stated, I have a long HTML document, approximately 1000 lines, containing lots of <tr> and <td> elements. Here is an example:
<td class="textleft"><a href="/DIRECCION-URL.html">TEXTO DE LA URL</a></td><td>DESCRIPCION</td><td class="mobile-hidden">POBLACION</td></tr>
That pattern is repeated about 100 times throughout the HTML, and from it I need to extract exactly /DIRECCION-URL.html on the one hand and TEXTO DE LA URL on the other, ideally into an associative array.
I have tried several things with preg_match_all in PHP, but the only attempt that worked returns the values containing http, which are precisely the ones I want to omit.
I must admit that the code below is copied and that I have only tweaked it a bit to try to adapt it to what I need, but I cannot get it to work:
preg_match_all('#/[^,\s()<>]+(?:([\w\d]+)|([^,[:punct:]\s]|/))#', $htmlcontent , $results);
You can achieve this with XPath.
With
//td[@class='textleft']/a/@href
you get the value of the href attribute (@href) of every link (a) inside every cell (td) whose class attribute is 'textleft' ([@class='textleft']).
With
//td[@class='textleft']/a/text()
you get the text (/text()) of every link (/a) inside every cell (td) whose class attribute is 'textleft' ([@class='textleft']).
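In PHP, these expressions could be run with DOMDocument and DOMXPath. What follows is only a minimal sketch: it assumes the markup is already in $htmlcontent (as in the question), uses a shortened two-row sample in which the second row (/OTRA-URL.html, OTRO TEXTO) is made up for illustration, and selects the a elements themselves so that href and text stay paired. The array name $links is likewise an assumption.

```php
<?php
// Sample markup modelled on the question; the second row is hypothetical.
// In practice $htmlcontent would hold the full page.
$htmlcontent = '<table>'
    . '<tr><td class="textleft"><a href="/DIRECCION-URL.html">TEXTO DE LA URL</a></td>'
    . '<td>DESCRIPCION</td><td class="mobile-hidden">POBLACION</td></tr>'
    . '<tr><td class="textleft"><a href="/OTRA-URL.html">OTRO TEXTO</a></td>'
    . '<td>DESCRIPCION</td><td class="mobile-hidden">POBLACION</td></tr>'
    . '</table>';

// Real-world HTML is rarely valid, so collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($htmlcontent);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select each link inside a td.textleft cell, then read its href and text together.
$links = [];
foreach ($xpath->query("//td[@class='textleft']/a") as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

print_r($links);
```

Because the array is keyed by href, two links pointing to the same URL would overwrite each other; if that can happen in the real page, a list of ['href' => ..., 'text' => ...] pairs may be the safer shape.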