This code is supposed to remove the a tag within the span within each div, but this fails on the second iteration ("ARGH!!") Is there something wrong with the remove function?
$html = str_get_html("
<div>
<span>
<a>DELETE</a>KEEP
</span>
</div>
<div>
<p>HELLO</p>
<span>
<a>DELETE</a>KEEP
</span>
</div>
");
foreach($html->find("div") as $item) {
$foo = $item->find("span", 0);
if($a = $foo->find("a", 0))
$a->remove();
else
echo "ARGH!!\n";
}
No... slowly walks away
Dammit, I was hoping to find a solution before anyone noticed.
The remove function currently takes out elements from an array, which obviously makes it impossible to use indices to iterate over said array, unless you re-index the array using
array_values.However, unfortunately each node also stores the index of the closing element from the array, which means we can't simply use
array_valuesas all nodes must be updated as well or we need a different solution for indexing.This will probably take a while to find a solution for, sorry.
Noted. So
remove()must be used with caution, or not at all, until further notice.Yes, sorry about that. I'm currently working on getting better insight over the DOM tree to figure out a better solution. Here is an example:
The document
The tree before removing
.removeThe tree after removing
.removeYou can see that nesting and indices (numbers on the left) are incorrectly aligned. (the
paftertext: "Hello, World!"is one level too low. This totally breaks find operations after removal.You can still use the remove function but you need to save and load afterwards until this is fixed.
Your issue as well as a few others that weren't detected before are fixed in master. Let me know if you find more issues related to this.
I can confirm that my enclosed example doesn't scream "ARGH!!" anymore. Sounds like you had a good understanding of the problem. I'll let you know if I run into similar issues.
Hi, I faced the same issue ( 1.9-dev branch ).
I fixed it locally and then compared it with the master branch.
As I see it is fixed with
array_intersect()but there is one morearray_slice()call that will cause the same problems in some cases (for selectors with~).So it's needed to be changed too. :)
Last edit: atomeek 2020-03-23
Can you provide an example that reliably fails in current master and is fixed by this change?
I would like to add more tests for this to ensure it doesn't break in the future.