Java: How do I read content from a redirected URL?

Problem description:

I use the following Java code in a bean to read the content of a URL:

String url;
String inputLine;
StringBuilder srcCode = new StringBuilder();

public void setUrl(String value) {
    url = value;
}

private void scanWebPage() throws IOException {
    try {
        URL dest = new URL(url);
        URLConnection yc = dest.openConnection();
        yc.setUseCaches(false);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream()));
        while ((inputLine = in.readLine()) != null)
            srcCode = srcCode.append(inputLine);
        in.close();
    } catch (FileNotFoundException fne) {
        srcCode.append("File Not Found");
    }
}

The code works for most URLs, but not for redirected URLs. How can I update the code above to read content from a redirected URL? For redirected URLs, I get "File Not Found".

+0

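'java.net.URL' should follow redirects by default (unless you previously called 'HttpURLConnection.setFollowRedirects(false)'), so you should only see the content of the final target URL. Assuming the redirect itself doesn't lead to a 404 page... – 2013-02-27 11:27:59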

+2

If the protocol changes (i.e. from HTTP to HTTPS), the URL connection will not follow the redirect. Is that your scenario? Also, are you allowed to use [Apache HttpComponents](http://hc.apache.org/)? – Perception 2013-02-27 11:28:01

Give the following a try:

HttpURLConnection yc = (HttpURLConnection) dest.openConnection();
yc.setInstanceFollowRedirects(true);

In the context of the code above:

String url = "http://java.sun.com";
String inputLine;
StringBuilder srcCode = new StringBuilder();

URL dest = new URL(url);
HttpURLConnection yc = (HttpURLConnection) dest.openConnection();
yc.setInstanceFollowRedirects(true);
yc.setUseCaches(false);

BufferedReader in = new BufferedReader(
        new InputStreamReader(
                yc.getInputStream()));
while ((inputLine = in.readLine()) != null) {
    srcCode = srcCode.append(inputLine);
}

in.close();

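Embellished further to help you diagnose what is going on. This code turns off automatic redirects and then follows the Location header manually, printing each hop as it goes.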

@Test
public void f() throws IOException {
    String url = "http://java.sun.com";

    fetchURL(url);
}

private HttpURLConnection fetchURL(String url) throws IOException {
    URL dest = new URL(url);
    HttpURLConnection yc = (HttpURLConnection) dest.openConnection();
    yc.setInstanceFollowRedirects(false); // follow redirects by hand so every hop gets printed
    yc.setUseCaches(false);

    System.out.println("url = " + url);

    int responseCode = yc.getResponseCode();
    if (responseCode >= 300 && responseCode < 400) { // brute force check, far too wide
        return fetchURL(yc.getHeaderField("Location"));
    }

    System.out.println("yc.getResponseCode() = " + responseCode);

    return yc;
}
+1

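Chris - thanks, but this did not work. The redirected URLs are like "mini URLs": when entered in a web browser they change to the real URL, but through the Java code they do not change and are reported as invalid URLs. – user1492667 2013-02-27 16:06:48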

+0

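What URL are you going to? I tested the code above and found that it follows the redirect for the URL above. Is your case that the URL redirects to a different protocol? If so, that may be your problem, because HttpURLConnection will not follow those. In that case I would personally use a library, such as the one included in Play2 or Apache HttpCommons. Alternatively, you can always set auto-follow to false, read the Location header yourself, and then explicitly fetch that URL yourself. – 2013-02-28 09:01:08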
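
To make the last suggestion in that comment concrete, here is a minimal sketch of following redirects by hand, so that a hop from HTTP to HTTPS is fetched as well. The class name `RedirectReader`, the helper names, and the limit of 5 hops are my own assumptions for illustration, not anything from the posts above. It also assumes every hop is an http or https URL and that the body is UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class RedirectReader {

    // Hypothetical cap so that a redirect loop cannot run forever.
    private static final int MAX_REDIRECTS = 5;

    /** True only for the status codes that normally carry a Location to follow. */
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    /**
     * Resolves a Location header against the current URL, so both absolute and
     * relative targets work. Throws IllegalArgumentException on a malformed value.
     */
    static String resolve(String current, String location) {
        return URI.create(current).resolve(location).toString();
    }

    /** Reads the body at url, following redirects manually, including HTTP to HTTPS. */
    static String read(String url) throws IOException {
        String current = url;
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn = (HttpURLConnection) new URL(current).openConnection();
            conn.setInstanceFollowRedirects(false); // we decide where to go next
            conn.setUseCaches(false);

            int code = conn.getResponseCode();
            if (isRedirect(code)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location header from " + current);
                }
                current = resolve(current, location);
                continue;
            }

            // Not a redirect: read the body (getInputStream still throws on 4xx/5xx).
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line).append('\n');
                }
            }
            return body.toString();
        }
        throw new IOException("Too many redirects starting at " + url);
    }
}
```

Calling `RedirectReader.read(url)` from `scanWebPage()` would replace the `openConnection()` / `readLine()` loop. Resolving the Location value against the current URL matters because servers are allowed to send a relative path there.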

This is not a debugged version of your code, but you can take a look at this example:

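import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

public class GetURLData
{
    public static void main(String args[])
    {
        String url = "the url you want the response from";
        HttpClient httpClient = new DefaultHttpClient();
        // GET rather than POST: the page is only being read, and HttpClient 4.x
        // does not automatically follow most redirects for POST requests.
        HttpGet httpGet = new HttpGet(url);
        HttpResponse response;
        StringBuilder builder = new StringBuilder();
        try
        {
            response = httpClient.execute(httpGet);
            BufferedReader in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), "UTF-8"));
            char[] buf = new char[8000];
            int l = 0;
            while (l >= 0)
            {
                builder.append(buf, 0, l);
                l = in.read(buf);
            }
            in.close();
            System.out.println(builder.toString());
        } catch (Exception e)
        {
            System.out.println("Exception is :" + e);
            e.printStackTrace();
        }
    }
}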
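
If a third-party library is not an option and you are on Java 11 or later, the built-in `java.net.http.HttpClient` is another route. This is a swapped-in technique, not something from the answers above. The sketch below assumes the `Redirect.NORMAL` policy, which follows redirects except from HTTPS down to HTTP; the class name `ModernFetch` and its method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class ModernFetch {

    /** Builds a client that follows redirects, including HTTP to HTTPS. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    /** Returns the body of the final response after any redirects. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                newClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Unlike `getInputStream()` in the question, `send` does not throw for a 404, so it is worth checking `response.statusCode()` before trusting the body.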