A Simple Web Scraper in Python with the BeautifulSoup Library

This article shows how to find the value of a specific element in a web page. As an example, it uses BeautifulSoup to look up the visit count of a website.

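First, the BeautifulSoup module has to be installed. I use Anaconda, which already ships with a set of third-party libraries that includes BeautifulSoup. Details of the module can be found in the .../Anaconda3/pkgs folder.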
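
Before running anything, it can help to confirm that the module is importable from the current interpreter. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration, and `bs4` is the import name that the script in this article uses.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when the module cannot be located
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "bs4" is the package name used in "from bs4 import BeautifulSoup"
    print("bs4 available:", is_installed("bs4"))
```

If this prints False, the module is not visible to the interpreter you are running, which usually means a different Python environment is active.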

You also need some familiarity with HTML, which is not covered here.
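
As a quick orientation, the sketch below parses a small HTML fragment and selects tags by name and attribute. The fragment is invented for illustration and only loosely imitates the kind of markup a blog statistics box might use; it is not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, made up for this example
html = """
<html><head><title>Sample blog</title></head>
<body>
  <div class="grade-box clearfix">
    <dl><dt>Visits:</dt><dd title="12345">12345</dd></dl>
    <dl><dt>Points:</dt><dd title="678">678</dd></dl>
    <dl title="9999"><dt>Rank:</dt><dd>9999</dd></dl>
  </div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Text of the <title> tag
title_text = soup.title.string

# attrs={"title": True} matches every <dd> that has a title attribute,
# whatever its value
dd_tags = soup.find_all("dd", attrs={"title": True})
dd_titles = [dd["title"] for dd in dd_tags]

# A class string containing a space is matched against the exact attribute value
box = soup.find("div", attrs={"class": "grade-box clearfix"})

print(title_text)
print(dd_titles)
print(box is not None)
```

The third `<dd>` is skipped because it carries no `title` attribute; in this made-up fragment the attribute sits on the enclosing `<dl>` instead.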

The complete script is as follows:

----------------------

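from urllib import request

from bs4 import BeautifulSoup

# Fetch the blog home page and parse it with Python's built-in HTML parser
res = request.urlopen("https://blog.****.net/qq_33810188")
soup = BeautifulSoup(res, "html.parser")

# Container <div> that holds the statistics (not used further below)
stats_box = soup.find_all("div", attrs={"class": "grade-box clearfix"})
# <dd> tags with a title attribute hold the visit count and the points
dd_tags = soup.find_all("dd", attrs={"title": True})
# <dl> tags with a title attribute hold the ranking
rank_tags = soup.find_all("dl", attrs={"title": True})

print(soup.title.string)

# The first matching <dd> is the visit count, the second is the points
for n, dd in enumerate(dd_tags, start=1):
    for child in dd.children:
        if n == 1:
            print("Visits:", child)
        if n == 2:
            print("Points:", child)

# The ranking is the 24th child node, counted across all matching <dl> tags.
# This position is tied to the page's markup and breaks if the layout changes.
n = 0
for dl in rank_tags:
    for child in dl.children:
        n += 1
        if n == 24:
            print("Rank:", child.string)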
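
The script above finds the ranking by counting child nodes. One reason such a count can get large is that `.children` yields whitespace text nodes as well as tags. The sketch below shows this on a small fragment and reads the value directly instead. The HTML is hypothetical and does not reproduce the real page structure.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical fragment, made up for this example
html = """<dl title="9999">
  <dt>Rank:</dt>
  <dd>9999</dd>
</dl>"""

soup = BeautifulSoup(html, "html.parser")
dl = soup.find("dl", attrs={"title": True})

# .children includes the whitespace strings between the tags
all_children = list(dl.children)
# Keep only real tags by filtering on the Tag type
tag_children = [c for c in all_children if isinstance(c, Tag)]

# Two ways to read the value without counting positions
rank_from_attr = dl["title"]
rank_from_text = dl.find("dd").get_text(strip=True)

print(len(all_children), len(tag_children))
print(rank_from_attr, rank_from_text)
```

Whether the real page exposes the ranking in a `title` attribute or in a nested tag has to be checked against its actual markup; the point here is only that addressing a tag by name or attribute tends to survive layout changes better than a fixed child index.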