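Python crawler: HTTP basic access authentication

While scraping web data today I ran into an interesting case that none of the crawlers I had written before had to deal with, so I am writing it down here.

HTTP basic access authentication:

Before cookies were invented, the most common way to handle website logins was HTTP basic access authentication. A test site that uses it: http://pythonscraping.com/pages/auth/login.php

Opening the page directly in a browser gives the following (screenshot):

After entering a username and password (any values will do), the page looks like this (screenshot).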

Then I captured the traffic to take a look.

The details of the HTTP basic access authentication process are easy to find online; here is a write-up I recommend: https://juejin.im/entry/5ac175baf265da239e4e3999
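As a rough illustration of what the client ends up sending, here is a minimal sketch of how the Authorization header value is built: the username and password are joined with a colon and Base64-encoded. The function names are made up for this example, and UTF-8 is assumed as the character encoding so that a non-ASCII username still works.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an Authorization header value for HTTP basic authentication."""
    # Join the credentials with a colon, then Base64-encode the bytes.
    # UTF-8 is an assumption here; it keeps non-ASCII usernames intact.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(value: str) -> tuple:
    """Recover (username, password) from a header value, as a server would."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("tom枫明", "123456")
    print(header)
    print(decode_basic_auth_header(header))
```

Because Base64 is only an encoding, anyone who captures the request can recover the password, so this scheme is only reasonable over HTTPS.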

Below is the crawler part.

A direct request returns 401.
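To reproduce the 401 without depending on the remote test site, here is a small sketch that uses only the standard library. It starts a throwaway local server that demands basic authentication, then requests it once without credentials and once with them. The handler class, the demo credentials and the helper names are all hypothetical and exist only for this example.

```python
import base64
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical demo credentials, known only to this local test server.
EXPECTED = "Basic " + base64.b64encode(b"demo:123456").decode("ascii")


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("Authorization") == EXPECTED:
            body = b"Hello, demo!"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Challenge the client: this header is what makes a browser
            # pop up its username/password dialog.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="demo"')
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(url, authorization=None):
    """Return the HTTP status code, optionally sending an Authorization header."""
    request = urllib.request.Request(url)
    if authorization is not None:
        request.add_header("Authorization", authorization)
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_demo():
    """Start the server on a free port and return (status without auth, status with auth)."""
    server = HTTPServer(("127.0.0.1", 0), AuthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        return fetch_status(url), fetch_status(url, EXPECTED)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The first request is rejected with 401 and a WWW-Authenticate challenge; the second carries the header and gets 200.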

With the authentication information added:

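Code:

# -*- coding: utf-8 -*-

"""
# __author__ = "Tom枫明"
# HomePage: https://blog.****.net/fm345689
# Python -V: Python 3.6.1
"""

"""
HTTP basic access authentication
"""


import requests
from requests.auth import HTTPBasicAuth

# requests encodes str credentials as Latin-1, so a non-ASCII username
# is passed as UTF-8 bytes instead.
auth = HTTPBasicAuth("tom枫明".encode("utf-8"), "123456")

response = requests.post("http://pythonscraping.com/pages/auth/login.php", auth=auth)
print(response.status_code)  # status of the authenticated request
print(response.text)         # page body returned after authentication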
