Web scraping with BeautifulSoup: fetching fundsupermart data

Problem description:

I am using BeautifulSoup to fetch the recommended funds data from fundsupermart (https://www.fundsupermart.co.in/main/research/recommendedFundsNew.svdo), but I cannot get each fund's class and its other attributes.

When I use the select statement
soup.select(".table_bdrow1_style")

I do not get the class or the other attributes of the funds. I only get the fund names. Can anyone help me with this?


Show your code. 'select()' can't do everything - you need more code to get the data. – furas
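To illustrate the comment above: `select()` only returns the matching elements, and the cell text still has to be pulled out of each one. The following is a minimal sketch, not the asker's actual code. `SAMPLE_HTML` is a hypothetical stand-in for the page, whose real markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the fundsupermart table; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr class="table_bdrow1_style">
    <td>SBI BLUE CHIP FUND- GROWTH</td><td>Large Cap</td><td>7.82</td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A CSS class selector needs a leading dot; without it, select() looks
# for a tag literally named "table_bdrow1_style" and finds nothing.
rows = soup.select("tr.table_bdrow1_style")

# select() returns elements, so the text of each cell is extracted separately.
table = [[td.get_text(strip=True) for td in tr.select("td")] for tr in rows]
print(table)
```

On this sample, each inner list holds the text of one row's cells in document order.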

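import requests
from bs4 import BeautifulSoup

# Download the page and parse it with the lxml parser.
r = requests.get('https://www.fundsupermart.co.in/main/research/recommendedFundsNew.svdo')
soup = BeautifulSoup(r.text, 'lxml')
# Each fund row carries the class "table_bdrow1_style".
trs = soup.find_all(class_="table_bdrow1_style")

for tr in trs:
    # Collect every non-empty text fragment in the row, with whitespace stripped.
    row = list(tr.stripped_strings)
    print(row)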

Output:

['BSL FRONTLINE EQUITY FUND- GROWTH', 'Large Cap', '9.69', '3.87', '17.2', '17.01'] 
['ICICI PRUDENTIAL FOCUSED BLUECHIP EQUITY FUND- GROWTH', 'Large Cap', '9.11', '3.05', '15.34', '14.97'] 
['SBI BLUE CHIP FUND- GROWTH', 'Large Cap', '7.82', '6.49', '19.69', '18.73'] 
['AXIS EQUITY FUND- GROWTH', 'Large Cap', '1.08', '-1.44', '11.86', '14.31'] 
['BNP PARIBAS EQUITY FUND- GROWTH', 'Large Cap', '0.34', '0.89', '14.83', '14.96'] 
['RELIANCE TOP 200 FUND- GROWTH', 'Large Cap', '5.66', '2.21', '19.39', '17.18'] 
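The rows printed above are flat lists of strings. If named fields are more convenient, they can be converted along these lines. This is only a sketch: the field names `name`, `category`, and `values` are assumptions, since the thread does not say what the four numeric columns mean, and `SAMPLE_HTML` is a hypothetical stand-in for the page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one row of the page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <tr class="table_bdrow1_style">
    <td><a href="#">BSL FRONTLINE EQUITY FUND- GROWTH</a></td>
    <td>Large Cap</td><td>9.69</td><td>3.87</td><td>17.2</td><td>17.01</td>
  </tr>
</table>
"""


def to_number(text):
    """Return text as a float, or None if it is not numeric (e.g. a dash)."""
    try:
        return float(text)
    except ValueError:
        return None


def rows_to_records(html, parser="html.parser"):
    """Turn each fund row into a dict; the field names are illustrative only."""
    soup = BeautifulSoup(html, parser)
    records = []
    for tr in soup.find_all(class_="table_bdrow1_style"):
        cells = list(tr.stripped_strings)
        if len(cells) < 2:
            continue  # skip rows where only the name (or nothing) was parsed
        name, category, *values = cells
        records.append({
            "name": name,
            "category": category,
            "values": [to_number(v) for v in values],
        })
    return records


records = rows_to_records(SAMPLE_HTML)
print(records)
```

For the live page, the same function could be called with `r.text` and `parser="lxml"`, assuming lxml is installed.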

html.parser version:

import requests
from bs4 import BeautifulSoup

# Same request, but parsed with Python's built-in html.parser.
r = requests.get('https://www.fundsupermart.co.in/main/research/recommendedFundsNew.svdo')
soup = BeautifulSoup(r.content, 'html.parser')
trs = soup.find_all(class_="table_bdrow1_style")

for tr in trs:
    # With this parser, only the fund name ends up inside each row.
    row = list(tr.stripped_strings)
    print(row)

Output:

['BSL FRONTLINE EQUITY FUND- GROWTH'] 
['ICICI PRUDENTIAL FOCUSED BLUECHIP EQUITY FUND- GROWTH'] 
['SBI BLUE CHIP FUND- GROWTH'] 
['AXIS EQUITY FUND- GROWTH'] 
['BNP PARIBAS EQUITY FUND- GROWTH'] 
['RELIANCE TOP 200 FUND- GROWTH'] 

If you still get an error, please post your code and the error message.


r = requests.get('https://www.fundsupermart.co.in/main/research/recommendedFundsNew.svdo') soup = BeautifulSoup(r.content, 'html.parser') trs = soup.find_all(class_="table_bdrow1_style") trs Why doesn't this code work? Why doesn't html.parser work? – nitinvijay23


So with html.parser I can't get the result that lxml gives? Why is that? – nitinvijay23


I just noticed that too. I checked the HTML source with both parsers and it looks the same, so I am confused as well. –
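The thread does not settle why the two parsers disagree. One plausible but unconfirmed explanation is that the page's markup is not valid HTML, in which case each parser repairs it differently and builds a different tree. A rough way to investigate is to run the same extraction under every installed parser and compare the rows. The helper name `rows_by_parser` and the sample string below are hypothetical.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def rows_by_parser(html, class_name="table_bdrow1_style",
                   parsers=("html.parser", "lxml", "html5lib")):
    """Extract the matching rows once per parser; skip parsers not installed."""
    results = {}
    for parser in parsers:
        try:
            soup = BeautifulSoup(html, parser)
        except FeatureNotFound:
            continue  # this parser library is not available
        results[parser] = [list(tr.stripped_strings)
                           for tr in soup.find_all(class_=class_name)]
    return results


# Well-formed sample markup (hypothetical): every parser should agree on it.
WELL_FORMED = (
    '<table><tr class="table_bdrow1_style">'
    '<td>AXIS EQUITY FUND- GROWTH</td><td>Large Cap</td>'
    '</tr></table>'
)

for parser, rows in rows_by_parser(WELL_FORMED).items():
    print(parser, rows)
```

Passing `r.content` from the request instead of `WELL_FORMED` would show which parsers keep the extra cells inside each row. If the results differ, comparing `soup.prettify()` from each parser around one fund row may show where the trees diverge.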