Python: Page Navigator Maximum Value Scraper - only getting the output for the last value

Problem description:

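This is the program I created to extract the maximum page value from each category section in the list. I cannot get all of the values; I only get the value for the last item in the list. What do I need to change to get all of the outputs?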

import bs4 
from urllib.request import urlopen as uReq 
from bs4 import BeautifulSoup as soup 

#List for extended links to the base url 

links = ['Link_1/','Link_2/','Link_3/'] 
#Function to find the biggest number present in the page navigation 
#section. The element just before 'Next →' holds the upper limit 

def page_no(): 
    bs = soup(page_html, "html.parser") 
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text 
    print(max_page) 

#url loop 
for url in links: 
    my_urls ='http://example.com/category/{}/'.format(url) 

# opening up connection,grabbing the page 
uClient = uReq(my_urls) 
page_html = uClient.read() 
uClient.close() 
page_no() 

Page navigation example: 1 2 3 … 15 Next →

Thanks in advance.

Please provide the real URL you are parsing.

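You need to pass page_html into the function and indent the last four lines so that they run inside the loop. It is also better to return the max_page value so that you can use it outside the function.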
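To see why the original layout reports only one value, here is a minimal sketch that leaves out the network and the parsing entirely. The `fake_fetch` helper is hypothetical and only stands in for `uReq(...).read()`; the variable names `fetched_once` and `fetched_each` are likewise made up for illustration.

```python
links = ['Link_1/', 'Link_2/', 'Link_3/']

def fake_fetch(url):
    # Hypothetical stand-in for uReq(url).read(); no network access.
    return 'html of ' + url

# Original layout: only the assignment is inside the loop body.
for link in links:
    my_url = 'http://example.com/category/{}'.format(link)
# This line is outside the loop, so it runs once, with the last URL.
fetched_once = [fake_fetch(my_url)]

# Fixed layout: the fetch is indented into the loop body.
fetched_each = []
for link in links:
    my_url = 'http://example.com/category/{}'.format(link)
    fetched_each.append(fake_fetch(my_url))

print(len(fetched_once), len(fetched_each))  # prints: 1 3
```

Applied to the real script, that means the fetch and the `page_no` call both have to sit inside the `for` body.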

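from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

links = ['Link_1/','Link_2/','Link_3/']

# Take the page HTML as a parameter instead of relying on a global variable.
def page_no(page_html):
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text
    return max_page

# URL loop: everything below is indented so that it runs once per link.
for url in links:
    my_urls = 'http://example.com/category/{}/'.format(url)
    # Open the connection and grab the page.
    uClient = uReq(my_urls)
    page_html = uClient.read()
    uClient.close()
    max_page = page_no(page_html)
    print(max_page)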
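
The function can also be checked without any network access. The sketch below feeds `page_no` a hand-written HTML snippet; that markup is only an assumption modelled on the "1 2 3 … 15 Next →" example, since the real page was never shown. It also uses `find_previous('a')`, a slightly stricter variant of the call above that only considers link tags, and it returns `None` when there is no "Next" link. The `max_pages` dictionary is a made-up name for collecting the results.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the "1 2 3 … 15 Next →" example;
# the real site's HTML may differ.
SAMPLE_HTML = """
<div class="nav-links">
  <span class="page-numbers current">1</span>
  <a class="page-numbers" href="/page/2/">2</a>
  <a class="page-numbers" href="/page/3/">3</a>
  <span class="page-numbers dots">…</span>
  <a class="page-numbers" href="/page/15/">15</a>
  <a class="next page-numbers" href="/page/2/">Next →</a>
</div>
"""

def page_no(page_html):
    """Return the highest page number as an int, or None if no 'Next' link exists."""
    bs = BeautifulSoup(page_html, "html.parser")
    next_link = bs.find('a', {'class': 'next page-numbers'})
    if next_link is None:
        return None  # single-page category or different markup
    # The last numbered link is the closest <a> before the 'Next' link.
    return int(next_link.find_previous('a').text)

# Collect one result per category instead of only printing it.
max_pages = {}
for name, html in [('Link_1/', SAMPLE_HTML), ('Link_2/', '<p>no navigation</p>')]:
    max_pages[name] = page_no(html)

print(max_pages)  # prints: {'Link_1/': 15, 'Link_2/': None}
```

Storing the values in a dictionary keyed by link keeps them available after the loop, which is the practical benefit of returning `max_page` rather than printing it inside the function.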