XPath only extracts some of the data from a website

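Question:

I am using XPath and Python to try to get data from the website shown in the code. I have managed to download most of the data (after a fashion), but I cannot extract the Greyhound field, and DogDetail also comes out looking quite odd. The greyhound's name is actually the text of an a tag with an href path, and after trying various XPath variations I still cannot get the data out. The overall plan is to download several days of greyhound racing results into a database (or spreadsheet). Any help is appreciated.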

from lxml import html 
import requests 


page = requests.get('http://www.gbgb.org.uk/resultsRace.aspx?id=1838526') 
tree = html.fromstring(page.content) 

track=tree.xpath('//div[@class="track"]/text() ') 
print 'Track',track 

date=tree.xpath('//div[@class="date"]/text() ') 
print 'date',date 

datetime=tree.xpath('//div[@class="datetime"]/text() ') 
print 'datetime', datetime 

essentialgreyhound=tree.xpath('//a[@href="essential greyhound"]/text() ') 
print 'Greyhound', essentialgreyhound 

firstessentialfin= tree.xpath('//li[@class="first essential fin"]//text()') 
print 'Position:', firstessentialfin 
sp= tree.xpath('//li[@class="sp"]/text() ') 
print 'StartingPrice:', sp 
trap= tree.xpath('//li[@class="trap"]/text() ') 
print 'Trap:', trap 
trainer= tree.xpath('//li[@class="essential trainer"]/text() ') 
print 'Trainer:', trainer 
timeSec=tree.xpath('//li[@class="timeSec"]/text() ') 
print 'TimeSec',timeSec 
timeDistance=tree.xpath('//li[@class="timeDistance"]/text() ') 
print 'TimeDistance',timeDistance 

firstessentialcomment=tree.xpath('//li[@class="first essential comment"]/text() ') 
print 'Comment',firstessentialcomment 
firstessential=tree.xpath('//li[@class="first essential"]/text()') 
print 'DogDetail', firstessential

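Why did you tag this as Python 3? The print statements suggest Python 2. – alecxe

Hi alecxe, I tagged it with both P3 and P2; I am also trying to do something similar with BS. Many thanks for your reply. – moonshadow

Answer:

You should fix the XPath for your Greyhound column: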

//li[@class="essential greyhound"]/a/text()

This prints for me:

Greyhound ['Ultimate Bundle', 'Powerfast Raven', 'Upagumtree', 'Buglys Causeway', 'Group Vespa', 'Winword Jacko']
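
The original expression `//a[@href="essential greyhound"]` most likely returned nothing because it compares the link's `href` attribute with the class string, whereas the class sits on the enclosing `li`. Below is a minimal sketch of the difference, run against a hand-written fragment rather than the live page. The markup is an assumption modelled on the class names in the question (the `href` values are invented), not a copy of the real site.

```python
from lxml import html

# Hypothetical fragment: the class name comes from the question; the
# structure and href values are assumed purely for illustration.
SAMPLE = """
<ul>
  <li class="essential greyhound"><a href="/dog?id=1">Ultimate Bundle</a></li>
  <li class="essential greyhound"><a href="/dog?id=2">Powerfast Raven</a></li>
</ul>
"""

tree = html.fromstring(SAMPLE)

# Original attempt: looks for <a> elements whose href equals the class string.
wrong = tree.xpath('//a[@href="essential greyhound"]/text()')

# Corrected: select the <li> by its class, then step into its <a> child.
right = tree.xpath('//li[@class="essential greyhound"]/a/text()')

print(wrong)
print(right)
```

Note that `@class="..."` is an exact string comparison, so it stops matching if the site adds or reorders class names; `contains(@class, "greyhound")` is a more tolerant alternative.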

Hi alecxe, thank you very much for your reply, it works perfectly. :) – moonshadow
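
For the stated goal of getting the results into a spreadsheet, one option is to zip the parallel lists returned by each XPath into rows and write them with the standard `csv` module. This is a rough sketch for Python 3. It assumes every list holds one entry per dog in the same order, which only holds if no row is missing a field. The greyhound names are taken from the answer's output; the trap values are made-up placeholders. Stray whitespace in `text()` results may also be what makes DogDetail look odd, so the values are stripped first.

```python
import csv
import io

# Names come from the answer's output; the trap values are invented
# placeholders carrying the kind of whitespace text() results often have.
greyhounds = ['Ultimate Bundle', 'Powerfast Raven', 'Upagumtree']
traps = [' 1 ', '\r\n 2', '3 ']


def clean(values):
    """Strip surrounding whitespace and drop entries that end up empty."""
    return [v.strip() for v in values if v.strip()]


# Pair the cleaned columns up row by row.
rows = list(zip(clean(greyhounds), clean(traps)))

# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open('results.csv', 'w', newline='') instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['greyhound', 'trap'])
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

If some rows can lack a field, iterating over each result row element and running relative XPaths inside it is safer than zipping page-wide lists.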
