XPath only extracts some of the data from a website

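Question:

I am using XPath with Python to try to get data from the website in the code below. I have managed to download most of the data (after a fashion), but I cannot extract the Greyhound field, and DogDetail also comes out rather oddly. The Greyhound data is actually inside an <a> tag with an href path, and after trying various XPath variations I still cannot get the data out. The overall plan is to download the day's dog racing results into a database (or spreadsheet). Any help is appreciated.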

from lxml import html 
import requests 


page = requests.get('http://www.gbgb.org.uk/resultsRace.aspx?id=1838526') 
tree = html.fromstring(page.content) 

track=tree.xpath('//div[@class="track"]/text() ') 
print 'Track',track 

date=tree.xpath('//div[@class="date"]/text() ') 
print 'date',date 

datetime=tree.xpath('//div[@class="datetime"]/text() ') 
print 'datetime', datetime 

essentialgreyhound=tree.xpath('//a[@href="essential greyhound"]/text() ') 
print 'Greyhound', essentialgreyhound 

firstessentialfin= tree.xpath('//li[@class="first essential fin"]//text()') 
print 'Position:', firstessentialfin 
sp= tree.xpath('//li[@class="sp"]/text() ') 
print 'StartingPrice:', sp 
trap= tree.xpath('//li[@class="trap"]/text() ') 
print 'Trap:', trap 
trainer= tree.xpath('//li[@class="essential trainer"]/text() ') 
print 'Trainer:', trainer 
timeSec=tree.xpath('//li[@class="timeSec"]/text() ') 
print 'TimeSec',timeSec 
timeDistance=tree.xpath('//li[@class="timeDistance"]/text() ') 
print 'TimeDistance',timeDistance 

firstessentialcomment=tree.xpath('//li[@class="first essential comment"]/text() ') 
print 'Comment',firstessentialcomment 
firstessential=tree.xpath('//li[@class="first essential"]/text()') 
print 'DogDetail', firstessential 

Why did you tag it as Python 3? The print statements suggest this is Python 2. – alecxe

Hi alecxe, I tagged it with both P3 and P2, and I have also been trying to do something similar with BS. Many thanks for your reply. – moonshadow

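You should fix the XPath for your Greyhound column. The name is the text of an <a> element inside the <li class="essential greyhound"> item, so select the <li> by its class and then step into the link: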

//li[@class="essential greyhound"]/a/text() 

For me this prints:

Greyhound ['Ultimate Bundle', 'Powerfast Raven', 'Upagumtree', 'Buglys Causeway', 'Group Vespa', 'Winword Jacko'] 
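
To see why the original expression returned nothing, here is a minimal offline sketch. It assumes a simplified, hypothetical fragment of markup modelled on the class names in the question (the real page may be structured differently), and it uses the standard library's ElementTree, which supports only a subset of XPath, in place of lxml so that it runs without network access. With lxml, the equivalent call is the tree.xpath() expression shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup modelled on the class names in the question;
# the href values and the trainer name are placeholders, not real page data.
SAMPLE = """
<ul>
  <li class="essential greyhound"><a href="/greyhound?id=1">Ultimate Bundle</a></li>
  <li class="essential greyhound"><a href="/greyhound?id=2">Powerfast Raven</a></li>
  <li class="essential trainer">A Trainer</li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# Original attempt: asks for <a> elements whose href equals the class name.
# No link has that href, so the result is empty.
wrong = [a.text for a in root.findall(".//a[@href='essential greyhound']")]

# Corrected: select the <li> by its class, then step into its child <a>.
right = [a.text for a in root.findall(".//li[@class='essential greyhound']/a")]

print(wrong)
print(right)
```

The class attribute belongs to the <li>, while the href on the <a> holds a URL, which is why matching href against the class name finds no elements.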

Hi alecxe, many thanks for your reply, it works perfectly. :) – moonshadow
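
For the stated goal of getting the results into a spreadsheet, one possible next step is to combine the per-column lists into rows and write them out as CSV. This is only a sketch under assumptions: it targets Python 3, the trap values are made-up placeholders, and the column names are hypothetical. It writes to an in-memory buffer; a real script would open a file instead.

```python
import csv
import io

# Lists shaped like the ones tree.xpath() returns, one entry per dog.
# The trap values are placeholders for illustration, not real results.
greyhounds = ['Ultimate Bundle', 'Powerfast Raven', 'Upagumtree']
traps = ['1', '2', '3']

# zip() pairs the i-th trap with the i-th greyhound, giving one row per dog.
rows = list(zip(traps, greyhounds))

# Write a header line followed by the rows into an in-memory CSV.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['trap', 'greyhound'])
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Note that zip() stops at the shortest list, so a column whose XPath returns too few items would silently drop rows; comparing the list lengths first is a cheap sanity check.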