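How do I get the web session from a cookie?
Question:
I am trying to build a web page scraper, but in order to POST data I need a web session ID like this:
web_session=HQJ3G1GPAAHRZGFR
How can I get that ID?
My code so far is: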
Private Sub test()
    Dim postData As String = "web_session=HQJ3G1GPAAHRZGFR&intext=O&term_code=201210&search_type=A&keyword=&kw_scope=all&kw_opt=all&subj_code=BIO&crse_numb=205&campus=*&instructor=*&instr_session=*&attr_type=*&mon=on&tue=on&wed=on&thu=on&fri=on&sat=on&sun=on&avail_flag=on" '/BANPROD/pkgyc_yccsweb.P_Results
    Dim tempCookie As New CookieContainer
    Dim encoding As New UTF8Encoding
    Dim byteData As Byte() = encoding.GetBytes(postData)
    System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
    Try
        tempCookie.GetCookies(New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"))
        'postData="web_session=" & tempCookie.
        Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.KeepAlive = True
        postReq.CookieContainer = tempCookie
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.0.3705; Media Center PC 4.0; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
        postReq.ContentLength = byteData.Length
        Dim postreqstream As Stream = postReq.GetRequestStream
        postreqstream.Write(byteData, 0, byteData.Length)
        postreqstream.Close()
        Dim postresponse As HttpWebResponse
        postresponse = DirectCast(postReq.GetResponse, HttpWebResponse)
        tempCookie.Add(postresponse.Cookies)
        Dim postresreader As New StreamReader(postresponse.GetResponseStream)
        Dim thepage As String = postresreader.ReadToEnd
        MsgBox(thepage)
    Catch ex As WebException
        MsgBox(ex.Status.ToString & vbNewLine & ex.Message.ToString)
    End Try
End Sub
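Answer:
The problem is that tempCookie.GetCookies() doesn't do what you think it does. What it actually does is essentially filter a pre-existing CookieCollection down to only the cookies that apply to the supplied URL. Since your container is brand new, there is nothing in it to filter. Instead, you first need to make a request to a page that will give you the session token, and then make the actual request for your data. So first request the P_Search page, then reuse the CookieContainer bound to that request and POST to P_Results.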
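If you want to stay with HttpWebRequest, the two steps described above could look roughly like the following. This is only a minimal sketch: the module name SessionSketch and the BaseUrl constant are made up for illustration, the form fields are borrowed from the WebClient example in this answer, and it assumes the server hands out its session as a cookie on P_Search. The SecurityProtocol line from the question is left out here; add it back if the server requires it.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

' Hypothetical console module, for illustration only.
Module SessionSketch
    ' Assumption: both pages live under this base URL, as in the question.
    Private Const BaseUrl As String = "https://taylor.yc.edu/BANPROD/"

    Sub Main()
        Dim cookies As New CookieContainer()

        ' A fresh container holds nothing, so GetCookies has nothing to filter.
        ' Prints 0
        Console.WriteLine(cookies.GetCookies(New Uri(BaseUrl)).Count)

        ' Step 1: GET the search page so the server can issue its session cookie.
        Dim getReq As HttpWebRequest = DirectCast(WebRequest.Create(BaseUrl & "pkgyc_yccsweb.P_Search"), HttpWebRequest)
        getReq.CookieContainer = cookies
        Using getResp As HttpWebResponse = DirectCast(getReq.GetResponse(), HttpWebResponse)
            ' Nothing to read here; cookies from the response are stored
            ' in the container because it is attached to the request.
        End Using

        ' Step 2: POST the search form using the same container.
        Dim body As Byte() = Encoding.UTF8.GetBytes("term_code=201130&search_type=K&keyword=math")
        Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create(BaseUrl & "pkgyc_yccsweb.P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.CookieContainer = cookies
        postReq.ContentLength = body.Length
        Using reqStream As Stream = postReq.GetRequestStream()
            reqStream.Write(body, 0, body.Length)
        End Using

        Using postResp As HttpWebResponse = DirectCast(postReq.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(postResp.GetResponseStream())
                Console.WriteLine(reader.ReadToEnd())
            End Using
        End Using
    End Sub
End Module
```

The key point is that the same CookieContainer instance is attached to both requests, so there should be no need to copy the session ID into the POST data by hand.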
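However, rather than the HttpWebRequest object, let me point you to the WebClient class and my post here about extending it to support cookies. You'll find that it simplifies your code. Below is a complete VB2010 WinForms application that demonstrates this. If you still want to use HttpWebRequest objects, this should at least give you an idea of what else needs to be done: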
Option Strict On
Option Explicit On
Imports System.Net

Public Class Form1
    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        'Create our webclient
        Using WC As New CookieAwareWebClient()
            'Set SSLv3
            System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
            'Create a session, ignore what is returned
            WC.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
            'POST our actual data and get the results
            Dim S = WC.UploadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results", "POST", "term_code=201130&search_type=K&keyword=math")
            Trace.WriteLine(S)
        End Using
    End Sub
End Class

Public Class CookieAwareWebClient
    Inherits WebClient

    Private cc As New CookieContainer()
    Private lastPage As String

    Protected Overrides Function GetWebRequest(ByVal address As System.Uri) As System.Net.WebRequest
        Dim R = MyBase.GetWebRequest(address)
        If TypeOf R Is HttpWebRequest Then
            With DirectCast(R, HttpWebRequest)
                .CookieContainer = cc
                If Not lastPage Is Nothing Then
                    .Referer = lastPage
                End If
            End With
        End If
        lastPage = address.ToString()
        Return R
    End Function
End Class
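This is great. I tried to figure this out for ages and could never get it to work. Thanks for your help! – Jon49
I need to support this kind of functionality in a concurrent environment. I know WebClient doesn't support concurrent I/O, but is there a way to give multiple web requests a single 'CookieContainer' so that they all use one session? I can explain the logic further if needed. – Terry
Could you use 'SyncLock'? http://stackoverflow.com/a/396248/231316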
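One way the SyncLock idea might be applied is sketched below. It is not from the answer: SharedCookieWebClient and SessionHolder are hypothetical names, and the sketch assumes the server tolerates several simultaneous requests on one session, which has not been verified. Each worker gets its own WebClient (a single instance cannot run concurrent operations), all of them share one CookieContainer, and SyncLock guards only the one-time session setup.

```vbnet
Imports System.Net

' Hypothetical variant of CookieAwareWebClient that receives its
' CookieContainer from the caller instead of creating its own.
Public Class SharedCookieWebClient
    Inherits WebClient

    Private ReadOnly sharedCookies As CookieContainer

    Public Sub New(ByVal cookies As CookieContainer)
        sharedCookies = cookies
    End Sub

    Protected Overrides Function GetWebRequest(ByVal address As Uri) As WebRequest
        Dim request As WebRequest = MyBase.GetWebRequest(address)
        Dim httpRequest As HttpWebRequest = TryCast(request, HttpWebRequest)
        If httpRequest IsNot Nothing Then
            ' Every client created with the same container shares one session.
            httpRequest.CookieContainer = sharedCookies
        End If
        Return request
    End Function
End Class

' Hypothetical holder that creates the session once and hands out clients.
Public Module SessionHolder
    Private ReadOnly gate As New Object()
    Private ReadOnly cookies As New CookieContainer()
    Private sessionReady As Boolean = False

    Public Function CreateClient() As SharedCookieWebClient
        ' Only one thread at a time may check or create the session.
        SyncLock gate
            If Not sessionReady Then
                Using wc As New SharedCookieWebClient(cookies)
                    ' Same session-creating request as in the answer's example.
                    wc.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
                End Using
                sessionReady = True
            End If
        End SyncLock
        ' One client per caller; do not share a client across threads.
        Return New SharedCookieWebClient(cookies)
    End Function
End Module
```

Each thread would call SessionHolder.CreateClient() and dispose the returned client when done. If concurrent requests on the shared container turn out to misbehave, a more conservative option is to widen the SyncLock so it also covers each UploadString call, at the cost of serializing the requests.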