How do I remove the query string and hash fragment from a URL using JSOUP or ColdFusion?
Question:
When I parse an HTML page, I get duplicate URL values such as:
- https://*.com/questions/tagged/java?sort=featured&pageSize=50
- https://*.com/questions/tagged/java#comments
- https://*.com/questions/tagged/java#comment212
How can I avoid duplicates like the ones above?
I only need this URL: https://*.com/questions/tagged/java
Answer
I created a helper method, processURL().
It takes a URL and returns everything up to, but not including, the first question mark (?) or hash sign (#):
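String processURL(String theURL) {
    // indexOf returns -1 when the character is absent.
    int queryPos = theURL.indexOf('?');
    int hashPos = theURL.indexOf('#');
    // Cut at whichever marker comes first; keep the whole URL if neither is present.
    int endPos = theURL.length();
    if (queryPos >= 0) {
        endPos = queryPos;
    }
    if (hashPos >= 0 && hashPos < endPos) {
        endPos = hashPos;
    }
    return theURL.substring(0, endPos);
}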
String urlOne = "http://*.com/questions/tagged/java?sort=featured&pageSize=50";
String urlTwo = "http://*.com/questions/tagged/java#comments";
System.out.println(processURL(urlOne));
System.out.println(processURL(urlTwo));
Output:
http://*.com/questions/tagged/java
http://*.com/questions/tagged/java
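
Since the question is about avoiding duplicates, here is a rough sketch of how the stripped URLs could be collected into a set so that each base URL is kept only once. It is one possible approach, not the only one: it uses java.net.URI instead of string searching, and it assumes the links are ordinary absolute http(s) URLs. The class name UrlDedup, the method names, and the example.com links are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UrlDedup {

    // Hypothetical helper: rebuilds the URL from scheme, authority, and path,
    // leaving out the query string and the fragment.
    static String stripQueryAndFragment(String url) {
        try {
            URI parsed = URI.create(url);
            return new URI(parsed.getScheme(), parsed.getAuthority(),
                    parsed.getPath(), null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    // A LinkedHashSet drops duplicates and keeps first-seen order.
    static Set<String> uniqueBaseUrls(List<String> urls) {
        Set<String> unique = new LinkedHashSet<>();
        for (String url : urls) {
            unique.add(stripQueryAndFragment(url));
        }
        return unique;
    }

    public static void main(String[] args) {
        // Example links (assumed); in practice these would come from the parsed page.
        List<String> links = Arrays.asList(
                "https://example.com/questions/tagged/java?sort=featured&pageSize=50",
                "https://example.com/questions/tagged/java#comments",
                "https://example.com/questions/tagged/java#comment212");
        System.out.println(uniqueBaseUrls(links));
    }
}
```

With the three example links above, the set should end up holding a single entry, https://example.com/questions/tagged/java. If the links come from JSOUP, the same helper could be applied to each absolute href before it is added to the set.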