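How to print the result of an lxml XPath query
Problem description:
I'm having some trouble parsing HTML with lxml in Python 2.7 / Flask 0.12.
I want to print the data matched by the XPath, but I can't find a way to do it. Here is my code.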
from lxml import html
from lxml import etree

def parse_html(html_src):
    # XPath for every product block inside the primary content area
    target_xpath = '//div[@class="primary-content"]//div[@class="mini-cart-product clearfix"]'
    detail_html = html.fromstring(html_src)
    page_tree = etree.ElementTree(detail_html)
    target_value_list = page_tree.xpath(target_xpath)
    # want to print 'target_value_list'
    return target_value_list
Here is html_src:
<div class="mini-cart-product clearfix">
<div class="mini-cart-image">
<a href="/carters-baby-boy-one-pieces/190795419986.html"><img src="https://www.carters.com/dw/image/v2/AAMK_PRD/on/demandware.static/-/Sites-carters_master_catalog/default/dw540ec9a5/hi-res/127G525_Default.jpg?sw=470" alt="Neon Little Brother Jumpsuit" title="Neon Little Brother Jumpsuit"></a>
<div class="mini-cart-brand">
<div class="carters"></div>
</div>
</div>
<div class="mini-cart-attributes">
<div class="product-name">
<a href="/carters-baby-boy-one-pieces/190795419986.html">Neon Little Brother Jumpsuit</a>
</div>
<div class="attribute Size">
<span class="label">Size:</span>
<span class="value">
9M
</span>
</div>
<div class="verticalLine">|</div>
<div class="attribute Color">
<span class="label">Color:</span>
<span class="value">
Blue
</span>
</div>
<div class="minicartpricedisplay">
<div class="price">
<span class="MSRP price-standard">
<span class="msrp">MSRP*:
$14.00
</span>
</span>
<span class="price-standard ">$6.00</span>
</div>
</div>
</div>
</div>
<div class="mini-cart-product clearfix">
<div class="mini-cart-image">
<a href="/carters-baby-boy-one-pieces/190795039832.html"><img src="https://www.carters.com/dw/image/v2/AAMK_PRD/on/demandware.static/-/Sites-carters_master_catalog/default/dw182a85c8/hi-res/118H023_Default.jpg?sw=470" alt="Piqué Polo Romper" title="Piqué Polo Romper"></a>
<div class="mini-cart-brand">
<div class="carters"></div>
</div>
</div>
<div class="mini-cart-attributes">
<div class="product-name">
<a href="/carters-baby-boy-one-pieces/190795039832.html">Piqué Polo Romper</a>
</div>
<div class="attribute Size">
<span class="label">Size:</span>
<span class="value">
12M
</span>
</div>
<div class="verticalLine">|</div>
<div class="attribute Color">
<span class="label">Color:</span>
<span class="value">
Blue
</span>
</div>
<div class="minicartpricedisplay">
<div class="price">
<span class="MSRP price-standard">
<span class="msrp">MSRP*:
$18.00
</span>
</span>
<span class="price-standard desktopvisible">$7.20</span>
</div>
</div>
</div>
</div>
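If I try to print 'target_value_list', it prints a list of what look like memory addresses. Can someone please help me?
I want to print what target_value_list contains, to check whether it includes all the items from html_src.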
Answer
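If you want the text values rather than a list of elements, you can use the /text() syntax in your XPath: for example, //div/text() or //div//text() instead of just //div. Alternatively, use the text property or the text_content() method, as below: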
target_value_list = [element.text_content() for element in page_tree.xpath(target_xpath)]
return target_value_list
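To check what the returned list actually holds, one option is to serialize each matched element back to markup and collect its text as well. The following is a minimal sketch, not part of the original answer: SAMPLE_HTML is a hypothetical, trimmed-down stand-in for html_src, and describe_matches is an illustrative helper name. It relies on lxml.html.tostring and text_content().

```python
from lxml import html

# Hypothetical, shortened stand-in for the html_src shown in the question.
SAMPLE_HTML = """
<div class="primary-content">
  <div class="mini-cart-product clearfix">
    <div class="product-name"><a href="/a.html">Neon Little Brother Jumpsuit</a></div>
  </div>
  <div class="mini-cart-product clearfix">
    <div class="product-name"><a href="/b.html">Pique Polo Romper</a></div>
  </div>
</div>
"""

TARGET_XPATH = '//div[@class="primary-content"]//div[@class="mini-cart-product clearfix"]'


def describe_matches(html_src, xpath=TARGET_XPATH):
    """Return a (markup, text) pair for every element matched by xpath."""
    root = html.fromstring(html_src)
    described = []
    for element in root.xpath(xpath):
        # Serialize the element to an HTML string instead of relying on its repr.
        markup = html.tostring(element, encoding="unicode").strip()
        # Collapse runs of whitespace in the element's text content.
        text = " ".join(element.text_content().split())
        described.append((markup, text))
    return described


matches = describe_matches(SAMPLE_HTML)
print(len(matches))  # number of matched product blocks
for markup, text in matches:
    print(text)
```

Comparing len(matches) with the number of product blocks you expect in html_src is a quick way to see whether the XPath caught every item.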
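Can you show us the output? What do you mean by *some memory addresses*? – Andersson
[,,] This is the output of print(target_value_list) – James
That is the list of elements matched by your 'XPath'. Your code works correctly. Can you be more specific about *what output you want to get*? Do you want the text content of each element? Show the 'XPath' you used. – Andersson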
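If what you want is specific fields from each matched element, one possible approach is to run relative XPath expressions (starting with .//) against each element in the list. This is only a sketch under assumptions: SAMPLE_HTML is a hypothetical, shortened version of the markup in the question, and extract_products and the chosen fields (name, size) are illustrative rather than anything the asker specified.

```python
from lxml import html

# Hypothetical, shortened version of the markup from the question.
SAMPLE_HTML = """
<div class="primary-content">
  <div class="mini-cart-product clearfix">
    <div class="product-name"><a href="/a.html">Neon Little Brother Jumpsuit</a></div>
    <div class="attribute Size"><span class="label">Size:</span><span class="value">
      9M
    </span></div>
  </div>
  <div class="mini-cart-product clearfix">
    <div class="product-name"><a href="/b.html">Pique Polo Romper</a></div>
    <div class="attribute Size"><span class="label">Size:</span><span class="value">
      12M
    </span></div>
  </div>
</div>
"""


def extract_products(html_src):
    """Build one dict per product block using XPath relative to that block."""
    root = html.fromstring(html_src)
    products = []
    for item in root.xpath('//div[@class="mini-cart-product clearfix"]'):
        # normalize-space() returns the text with surrounding whitespace trimmed.
        name = item.xpath('normalize-space(.//div[@class="product-name"]/a)')
        size = item.xpath(
            'normalize-space(.//div[@class="attribute Size"]/span[@class="value"])'
        )
        products.append({"name": name, "size": size})
    return products


products = extract_products(SAMPLE_HTML)
print(products)
```

The leading dot in .//div keeps each query scoped to the current product block; without it, // would search the whole document again for every item.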