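1. A Weibo crawler is best split into several modules:
(1) Simulated login
(2) Simulated browsing
(3) Handling the login ban triggered when many requests in a short time raise suspicion
(4) Other
(1) Simulated login module
Prerequisite: to simulate a login you first need to know what every HTTP request carries when you log in to Weibo. Fiddler, paired with a browser (other than Chrome), can be used to inspect each request.
Procedure:
(i) Enter weibo.com in the browser and capture the traffic:
One HTTP request in this exchange is particularly important: GET /sso/prelogin.php
This is the pre-login step, as the "pre" in the name suggests. Its main job is to fetch several parameters that go into the POST form of the next step.
You can open the request URL directly in a browser: http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.11)&_=1390381358228
The response looks like this (seven fields in total):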
sinaSSOController.preloginCallBack ({
1 "retcode":0,
2 "servertime":1390393012,
3 "pcid":"xd-7b601057c7fdb6c0cd2308ab66f824cf655d",
4 "nonce":"YNJDUH",
5 "pubkey":"EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443",
6 "rsakv":"1330428213",
7 "exectime":1
})
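Of these fields:
2. servertime: the server timestamp.
4. nonce: a 6-character random code.
5. pubkey: the public key used for the rsa2 password encryption.
6. rsakv: evidently also used for the encryption.
Fields 2, 4, 5 and 6 are all needed in the next step.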
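As a minimal sketch of pulling those four values out of the response, the snippet below strips the callback wrapper and parses the JSON inside it. The helper name parsePrelogin is made up for illustration, and the sample string is a shortened copy of the response above (pcid and pubkey are truncated for readability).

```javascript
// Hypothetical helper (not part of ssologin.js): unwrap the JSONP reply of
// prelogin.php and return the fields needed for the login POST.
function parsePrelogin(body) {
  // Take everything between the first "(" and the last ")".
  const start = body.indexOf("(");
  const end = body.lastIndexOf(")");
  if (start < 0 || end <= start) {
    throw new Error("not a JSONP response");
  }
  const data = JSON.parse(body.slice(start + 1, end));
  if (data.retcode !== 0) {
    throw new Error("prelogin failed, retcode=" + data.retcode);
  }
  const { servertime, nonce, pubkey, rsakv } = data;
  return { servertime, nonce, pubkey, rsakv };
}

// Sample shaped like the response above (pcid and pubkey shortened).
const preloginSample =
  'sinaSSOController.preloginCallBack ({"retcode":0,"servertime":1390393012,' +
  '"pcid":"xd-7b60","nonce":"YNJDUH","pubkey":"EB2A38","rsakv":"1330428213","exectime":1})';
console.log(parsePrelogin(preloginSample));
```

Slicing between the outer parentheses avoids depending on the exact callback name, which is taken from the callback query parameter and could be changed.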
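Older versions of the Sina Weibo login used SHA-based encryption and had no pubkey or rsakv parameters,
so the login code in the 2012 crawler examples found online no longer works as written. The overall logic has hardly changed, though, and with small modifications that code is still usable. That is the route I took, hitting countless walls along the way =.=
References:
Original reference: http://www.th7.cn/web/js/201308/12198.shtml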
For Python: http://www.douban.com/note/201767245/
For PHP: http://www.2cto.com/kf/201210/159591.html
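(ii) Enter the login details (username and password) on the login page and capture the traffic:
Look at the first HTTP request: POST /sso/login.php?client=ssologin.js(v1.4.11)
It carries several important parameters, including:
1. su: the encoded username
2. servertime
3. nonce
4. pwencode: the password encryption method
5. rsakv
6. sp: the encrypted password
The question now is how the username (su) and the password (sp) are produced.
The pwencode value readily tells us that the password is encrypted with rsa2.
Looking at the Weibo login page, http://login.sina.com.cn/signup/signin.php, and focusing on its JS code,
we find that the JS used for password encryption lives at http://login.sina.com.cn/js/sso/ssologin.js.
The code there is obfuscated, so run it through a decoding tool (http://www.cnlabs.net/tools/Js_Decoder/).
As noted above, the encryption uses servertime and nonce, so searching for those names
turns up the JS code that deals with su, sp, servertime and nonce: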
var makeRequest = function (username, password, savestate)
{
    var request =
    {
        entry : me.getEntry(), gateway : 1, from : me.from, savestate : savestate, useticket : me.useTicket ? 1 : 0
    };
    if (me.failRedirect) {
        me.loginExtraQuery.frd = 1
    }
    request = objMerge(request, {
        pagerefer : document.referrer || ""
    });
    request = objMerge(request, me.loginExtraFlag);
    request = objMerge(request, me.loginExtraQuery);
    request.su = sinaSSOEncoder.base64.encode(urlencode(username));
    if (me.service) {
        request.service = me.service
    }
    if ((me.loginType & rsa) && me.servertime && sinaSSOEncoder && sinaSSOEncoder.RSAKey)
    {
        request.servertime = me.servertime;
        request.nonce = me.nonce;
        request.pwencode = "rsa2";
        request.rsakv = me.rsakv;
        var RSAKey = new sinaSSOEncoder.RSAKey();
        RSAKey.setPublic(me.rsaPubkey, "10001");
        password = RSAKey.encrypt([me.servertime, me.nonce].join(" ") + " " + password)
    }
    else
    {
        if ((me.loginType & wsse) && me.servertime && sinaSSOEncoder && sinaSSOEncoder.hex_sha1)
        {
            request.servertime = me.servertime;
            request.nonce = me.nonce;
            request.pwencode = "wsse";
            password = sinaSSOEncoder.hex_sha1("" + sinaSSOEncoder.hex_sha1(sinaSSOEncoder.hex_sha1(password)) + me.servertime + me.nonce);
        }
    }
    request.sp = password;
    try {
        request.sr = window.screen.width + "*" + window.screen.height
    }
    catch (e) {}
    return request;
};
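As you can see, su is simply the URL-encoded username run through Base64 (an encoding rather than real encryption).
Looking up the sinaSSOEncoder.base64.encode function shows: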
this.base64 = {
    encode : function (l) {
        l = "" + l;
        if (l == "") {
            return ""
        }
        var j = "";
        var s, q, o = "";
        var r, p, n, m = "";
        var k = 0;
        do {
            s = l.charCodeAt(k++);
            q = l.charCodeAt(k++);
            o = l.charCodeAt(k++);
            r = s >> 2;
            p = ((s & 3) << 4) | (q >> 4);
            n = ((q & 15) << 2) | (o >> 6);
            m = o & 63;
            if (isNaN(q)) {
                n = m = 64
            }
            else {
                if (isNaN(o)) {
                    m = 64;
                }
            }
            j = j + this._keys.charAt(r) + this._keys.charAt(p) + this._keys.charAt(n) + this._keys.charAt(m);
            s = q = o = "";
            r = p = n = m = ""
        } while (k < l.length);
        return j;
    },
    decode : function (s, p, l)
    {
        ...
    },
    _keys : "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=",
    _keys_urlsafe : "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_="
};
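Outside the browser there is no need to port this encoder by hand, since it is ordinary Base64 over the standard alphabet. The following sketch computes su under Node.js; the name encodeUsername is invented here, and it assumes the script's urlencode() behaves like encodeURIComponent(), which the listing does not show.

```javascript
// Sketch of the su computation: URL-encode the username, then Base64 it.
// Assumption: ssologin.js's urlencode() is equivalent to encodeURIComponent().
function encodeUsername(username) {
  const urlEncoded = encodeURIComponent(username);
  return Buffer.from(urlEncoded, "utf8").toString("base64");
}

// "user@example.com" becomes "user%40example.com" before Base64.
console.log(encodeUsername("user@example.com")); // dXNlciU0MGV4YW1wbGUuY29t
```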
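sp is a little more complicated.
The if branch holds the rsa2 password encryption used by the current version of the Sina login, while the else branch holds the SHA-based encryption of the old version.
We only need to care about the if branch, and the encryption itself is short: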
var RSAKey = new sinaSSOEncoder.RSAKey();
RSAKey.setPublic(me.rsaPubkey, "10001");
password = RSAKey.encrypt([me.servertime, me.nonce].join(" ") + " " + password)
First an RSAKey is created through sinaSSOEncoder. So what is sinaSSOEncoder.RSAKey? Looking further, sinaSSOEncoder contains the following code:
1 var sinaSSOEncoder = sinaSSOEncoder || {}; 2 (function () 3 { 4 var av; 5 var ah = 244837814094590; 6 var Y = ((ah & 16777215) == 15715070); 7 function aq(z, t, az) 8 { 9 if (z != null) 10 { 11 if ("number" == typeof z) { 12 this.fromNumber(z, t, az) 13 } 14 else { 15 if (t == null && "string" != typeof z) { 16 this.fromString(z, 256) 17 } 18 else { 19 this.fromString(z, t) 20 } 21 } 22 } 23 } 24 function h() 25 { 26 return new aq(null) 27 } 28 function b(aB, t, z, aA, aD, aC) 29 { 30 while (--aC >= 0) { 31 var az = t * this [aB++] + z[aA] + aD; 32 aD = Math.floor(az / 67108864); 33 z[aA++] = az & 67108863 34 } 35 return aD 36 } 37 function ax(aB, aG, aH, aA, aE, t) 38 { 39 var aD = aG & 32767, aF = aG >> 15; 40 while (--t >= 0) 41 { 42 var az = this [aB] & 32767; 43 var aC = this [aB++] >> 15; 44 var z = aF * az + aC * aD; 45 az = aD * az + ((z & 32767) << 15) + aH[aA] + (aE & 1073741823); 46 aE = (az >>> 30) + (z >>> 15) + aF * aC + (aE >>> 30); 47 aH[aA++] = az & 1073741823 48 } 49 return aE 50 } 51 function aw(aB, aG, aH, aA, aE, t) 52 { 53 var aD = aG & 16383, aF = aG >> 14; 54 while (--t >= 0) 55 { 56 var az = this [aB] & 16383; 57 var aC = this [aB++] >> 14; 58 var z = aF * az + aC * aD; 59 az = aD * az + ((z & 16383) << 14) + aH[aA] + aE; 60 aE = (az >> 28) + (z >> 14) + aF * aC; 61 aH[aA++] = az & 268435455 62 } 63 return aE 64 } 65 if (Y && (navigator.appName == "Microsoft Internet Explorer")) { 66 aq.prototype.am = ax; 67 av = 30 68 } 69 else 70 { 71 if (Y && (navigator.appName != "Netscape")) { 72 aq.prototype.am = b; 73 av = 26 74 } 75 else { 76 aq.prototype.am = aw; 77 av = 28; 78 } 79 } 80 aq.prototype.DB = av; 81 aq.prototype.DM = ((1 << av) - 1); 82 aq.prototype.DV = (1 << av); 83 var Z = 52; 84 aq.prototype.FV = Math.pow(2, Z); 85 aq.prototype.F1 = Z - av; 86 aq.prototype.F2 = 2 * av - Z; 87 var ad = "0123456789abcdefghijklmnopqrstuvwxyz"; 88 var af = new Array(); 89 var ao, v; 90 ao = "0".charCodeAt(0); 91 for (v = 0; v <= 9; ++v) { 92 af[ao++] 
= v 93 } 94 ao = "a".charCodeAt(0); 95 for (v = 10; v < 36; ++v) { 96 af[ao++] = v 97 } 98 ao = "A".charCodeAt(0); 99 for (v = 10; v < 36; ++v) { 100 af[ao++] = v 101 } 102 function ay(t) 103 { 104 return ad.charAt(t) 105 } 106 function A(z, t) 107 { 108 var az = af[z.charCodeAt(t)]; 109 return (az == null) ?- 1 : az 110 } 111 function X(z) 112 { 113 for (var t = this.t - 1; t >= 0; --t) { 114 z[t] = this [t] 115 } 116 z.t = this.t; 117 z.s = this.s 118 } 119 function n(t) 120 { 121 this.t = 1; 122 this.s = (t < 0) ?- 1 : 0; 123 if (t > 0) { 124 this [0] = t 125 } 126 else { 127 if (t <- 1) { 128 this [0] = t + DV 129 } 130 else { 131 this.t = 0; 132 } 133 } 134 } 135 function c(t) 136 { 137 var z = h(); 138 z.fromInt(t); 139 return z 140 } 141 function w(aD, z) 142 { 143 var aA; 144 if (z == 16) { 145 aA = 4 146 } 147 else 148 { 149 if (z == 8) { 150 aA = 3 151 } 152 else 153 { 154 if (z == 256) { 155 aA = 8 156 } 157 else 158 { 159 if (z == 2) { 160 aA = 1 161 } 162 else { 163 if (z == 32) { 164 aA = 5 165 } 166 else { 167 if (z == 4) { 168 aA = 2 169 } 170 else { 171 this.fromRadix(aD, z); 172 return 173 } 174 } 175 } 176 } 177 } 178 } 179 this.t = 0; 180 this.s = 0; 181 var aC = aD.length, az = false, aB = 0; 182 while (--aC >= 0) 183 { 184 var t = (aA == 8) ? 
aD[aC] & 255 : A(aD, aC); 185 if (t < 0) { 186 if (aD.charAt(aC) == "-") { 187 az = true 188 } 189 continue 190 } 191 az = false; 192 if (aB == 0) { 193 this [this.t++] = t 194 } 195 else 196 { 197 if (aB + aA > this.DB) { 198 this [this.t - 1] |= (t & ((1 << (this.DB - aB)) - 1)) << aB; 199 this [this.t++] = (t >> (this.DB - aB)) 200 } 201 else { 202 this [this.t - 1] |= t << aB 203 } 204 } 205 aB += aA; 206 if (aB >= this.DB) { 207 aB -= this.DB 208 } 209 } 210 if (aA == 8 && (aD[0] & 128) != 0) { 211 this.s =- 1; 212 if (aB > 0) { 213 this [this.t - 1] |= ((1 << (this.DB - aB)) - 1) << aB 214 } 215 } 216 this.clamp(); 217 if (az) { 218 aq.ZERO.subTo(this, this) 219 } 220 } 221 function O() 222 { 223 var t = this.s & this.DM; 224 while (this.t > 0 && this [this.t - 1] == t) { 225 --this.t 226 } 227 } 228 function q(z) 229 { 230 if (this.s < 0) { 231 return "-" + this.negate().toString(z) 232 } 233 var az; 234 if (z == 16) { 235 az = 4 236 } 237 else 238 { 239 if (z == 8) { 240 az = 3 241 } 242 else 243 { 244 if (z == 2) { 245 az = 1 246 } 247 else { 248 if (z == 32) { 249 az = 5 250 } 251 else { 252 if (z == 4) { 253 az = 2 254 } 255 else { 256 return this.toRadix(z); 257 } 258 } 259 } 260 } 261 } 262 var aB = (1 << az) - 1, aE, t = false, aC = "", aA = this.t; 263 var aD = this.DB - (aA * this.DB) % az; 264 if (aA--> 0) 265 { 266 if (aD < this.DB && (aE = this [aA] >> aD) > 0) { 267 t = true; 268 aC = ay(aE) 269 } 270 while (aA >= 0) 271 { 272 if (aD < az) { 273 aE = (this [aA] & ((1 << aD) - 1)) << (az - aD); 274 aE |= this [--aA] >> (aD += this.DB - az) 275 } 276 else { 277 aE = (this [aA] >> (aD -= az)) & aB; 278 if (aD <= 0) { 279 aD += this.DB; 280 --aA 281 } 282 } 283 if (aE > 0) { 284 t = true 285 } 286 if (t) { 287 aC += ay(aE) 288 } 289 } 290 } 291 return t ? aC : "0" 292 } 293 function R() 294 { 295 var t = h(); 296 aq.ZERO.subTo(this, t); 297 return t 298 } 299 function ak() 300 { 301 return (this.s < 0) ? 
this.negate() : this 302 } 303 function G(t) 304 { 305 var az = this.s - t.s; 306 if (az != 0) { 307 return az 308 } 309 var z = this.t; 310 az = z - t.t; 311 if (az != 0) { 312 return az 313 } 314 while (--z >= 0) { 315 if ((az = this [z] - t[z]) != 0) { 316 return az; 317 } 318 } 319 return 0 320 } 321 function j(z) 322 { 323 var aA = 1, az; 324 if ((az = z >>> 16) != 0) { 325 z = az; 326 aA += 16 327 } 328 if ((az = z >> 8) != 0) { 329 z = az; 330 aA += 8 331 } 332 if ((az = z >> 4) != 0) { 333 z = az; 334 aA += 4 335 } 336 if ((az = z >> 2) != 0) { 337 z = az; 338 aA += 2 339 } 340 if ((az = z >> 1) != 0) { 341 z = az; 342 aA += 1 343 } 344 return aA 345 } 346 function u() 347 { 348 if (this.t <= 0) { 349 return 0 350 } 351 return this.DB * (this.t - 1) + j(this [this.t - 1]^(this.s & this.DM)) 352 } 353 function ap(az, z) 354 { 355 var t; 356 for (t = this.t - 1; t >= 0; --t) { 357 z[t + az] = this [t] 358 } 359 for (t = az - 1; t >= 0; --t) { 360 z[t] = 0 361 } 362 z.t = this.t + az; 363 z.s = this.s 364 } 365 function W(az, z) 366 { 367 for (var t = az; t < this.t; ++t) { 368 z[t - az] = this [t] 369 } 370 z.t = Math.max(this.t - az, 0); 371 z.s = this.s 372 } 373 function s(aE, aA) 374 { 375 var z = aE % this.DB; 376 var t = this.DB - z; 377 var aC = (1 << t) - 1; 378 var aB = Math.floor(aE / this.DB), aD = (this.s << z) & this.DM, az; 379 for (az = this.t - 1; az >= 0; --az) { 380 aA[az + aB + 1] = (this [az] >> t) | aD; 381 aD = (this [az] & aC) << z 382 } 383 for (az = aB - 1; az >= 0; --az) { 384 aA[az] = 0 385 } 386 aA[aB] = aD; 387 aA.t = this.t + aB + 1; 388 aA.s = this.s; 389 aA.clamp() 390 } 391 function l(aD, aA) 392 { 393 aA.s = this.s; 394 var aB = Math.floor(aD / this.DB); 395 if (aB >= this.t) { 396 aA.t = 0; 397 return 398 } 399 var z = aD % this.DB; 400 var t = this.DB - z; 401 var aC = (1 << z) - 1; 402 aA[0] = this [aB] >> z; 403 for (var az = aB + 1; az < this.t; ++az) { 404 aA[az - aB - 1] |= (this [az] & aC) << t; 405 aA[az - aB] = this 
[az] >> z 406 } 407 if (z > 0) { 408 aA[this.t - aB - 1] |= (this.s & aC) << t 409 } 410 aA.t = this.t - aB; 411 aA.clamp() 412 } 413 function aa(z, aA) 414 { 415 var az = 0, aB = 0, t = Math.min(z.t, this.t); 416 while (az < t) { 417 aB += this [az] - z[az]; 418 aA[az++] = aB & this.DM; 419 aB >>= this.DB 420 } 421 if (z.t < this.t) { 422 aB -= z.s; 423 while (az < this.t) { 424 aB += this [az]; 425 aA[az++] = aB & this.DM; 426 aB >>= this.DB 427 } 428 aB += this.s 429 } 430 else { 431 aB += this.s; 432 while (az < z.t) { 433 aB -= z[az]; 434 aA[az++] = aB & this.DM; 435 aB >>= this.DB 436 } 437 aB -= z.s 438 } 439 aA.s = (aB < 0) ?- 1 : 0; 440 if (aB <- 1) { 441 aA[az++] = this.DV + aB 442 } 443 else { 444 if (aB > 0) { 445 aA[az++] = aB; 446 } 447 } 448 aA.t = az; 449 aA.clamp() 450 } 451 function D(z, aA) 452 { 453 var t = this.abs(), aB = z.abs(); 454 var az = t.t; 455 aA.t = az + aB.t; 456 while (--az >= 0) { 457 aA[az] = 0 458 } 459 for (az = 0; az < aB.t; ++az) { 460 aA[az + t.t] = t.am(0, aB[az], aA, az, 0, t.t) 461 } 462 aA.s = 0; 463 aA.clamp(); 464 if (this.s != z.s) { 465 aq.ZERO.subTo(aA, aA) 466 } 467 } 468 function Q(az) 469 { 470 var t = this.abs(); 471 var z = az.t = 2 * t.t; 472 while (--z >= 0) { 473 az[z] = 0 474 } 475 for (z = 0; z < t.t - 1; ++z) 476 { 477 var aA = t.am(z, t[z], az, 2 * z, 0, 1); 478 if ((az[z + t.t] += t.am(z + 1, 2 * t[z], az, 2 * z + 1, aA, t.t - z - 1)) >= t.DV) { 479 az[z + t.t] -= t.DV; 480 az[z + t.t + 1] = 1; 481 } 482 } 483 if (az.t > 0) { 484 az[az.t - 1] += t.am(z, t[z], az, 2 * z, 0, 1) 485 } 486 az.s = 0; 487 az.clamp() 488 } 489 function E(aH, aE, aD) 490 { 491 var aN = aH.abs(); 492 if (aN.t <= 0) { 493 return 494 } 495 var aF = this.abs(); 496 if (aF.t < aN.t) { 497 if (aE != null) { 498 aE.fromInt(0) 499 } 500 if (aD != null) { 501 this.copyTo(aD) 502 } 503 return 504 } 505 if (aD == null) { 506 aD = h() 507 } 508 var aB = h(), z = this.s, aG = aH.s; 509 var aM = this.DB - j(aN[aN.t - 1]); 510 if (aM > 0) { 
511 aN.lShiftTo(aM, aB); 512 aF.lShiftTo(aM, aD) 513 } 514 else { 515 aN.copyTo(aB); 516 aF.copyTo(aD) 517 } 518 var aJ = aB.t; 519 var az = aB[aJ - 1]; 520 if (az == 0) { 521 return 522 } 523 var aI = az * (1 << this.F1) + ((aJ > 1) ? aB[aJ - 2] >> this.F2 : 0); 524 var aQ = this.FV / aI, aP = (1 << this.F1) / aI, aO = 1 << this.F2; 525 var aL = aD.t, aK = aL - aJ, aC = (aE == null) ? h() : aE; 526 aB.dlShiftTo(aK, aC); 527 if (aD.compareTo(aC) >= 0) { 528 aD[aD.t++] = 1; 529 aD.subTo(aC, aD) 530 } 531 aq.ONE.dlShiftTo(aJ, aC); 532 aC.subTo(aB, aB); 533 while (aB.t < aJ) { 534 aB[aB.t++] = 0 535 } 536 while (--aK >= 0) 537 { 538 var aA = (aD[--aL] == az) ? this.DM : Math.floor(aD[aL] * aQ + (aD[aL - 1] + aO) * aP); 539 if ((aD[aL] += aB.am(0, aA, aD, aK, 0, aJ)) < aA) { 540 aB.dlShiftTo(aK, aC); 541 aD.subTo(aC, aD); 542 while (aD[aL] <--aA) { 543 aD.subTo(aC, aD) 544 } 545 } 546 } 547 if (aE != null) { 548 aD.drShiftTo(aJ, aE); 549 if (z != aG) { 550 aq.ZERO.subTo(aE, aE) 551 } 552 } 553 aD.t = aJ; 554 aD.clamp(); 555 if (aM > 0) { 556 aD.rShiftTo(aM, aD) 557 } 558 if (z < 0) { 559 aq.ZERO.subTo(aD, aD) 560 } 561 } 562 function N(t) 563 { 564 var z = h(); 565 this.abs().divRemTo(t, null, z); 566 if (this.s < 0 && z.compareTo(aq.ZERO) > 0) { 567 t.subTo(z, z) 568 } 569 return z 570 } 571 function K(t) 572 { 573 this.m = t 574 } 575 function U(t) 576 { 577 if (t.s < 0 || t.compareTo(this.m) >= 0) { 578 return t.mod(this.m) 579 } 580 else { 581 return t; 582 } 583 } 584 function aj(t) 585 { 586 return t 587 } 588 function J(t) 589 { 590 t.divRemTo(this.m, null, t) 591 } 592 function H(t, az, z) 593 { 594 t.multiplyTo(az, z); 595 this.reduce(z) 596 } 597 function at(t, z) 598 { 599 t.squareTo(z); 600 this.reduce(z) 601 } 602 K.prototype.convert = U; 603 K.prototype.revert = aj; 604 K.prototype.reduce = J; 605 K.prototype.mulTo = H; 606 K.prototype.sqrTo = at; 607 function B() 608 { 609 if (this.t < 1) { 610 return 0 611 } 612 var t = this [0]; 613 if ((t & 1) == 0) { 
614 return 0 615 } 616 var z = t & 3; 617 z = (z * (2 - (t & 15) * z)) & 15; 618 z = (z * (2 - (t & 255) * z)) & 255; 619 z = (z * (2 - (((t & 65535) * z) & 65535))) & 65535; 620 z = (z * (2 - t * z % this.DV)) % this.DV; 621 return (z > 0) ? this.DV - z :- z 622 } 623 function f(t) 624 { 625 this.m = t; 626 this.mp = t.invDigit(); 627 this.mpl = this.mp & 32767; 628 this.mph = this.mp >> 15; 629 this.um = (1 << (t.DB - 15)) - 1; 630 this.mt2 = 2 * t.t 631 } 632 function ai(t) 633 { 634 var z = h(); 635 t.abs().dlShiftTo(this.m.t, z); 636 z.divRemTo(this.m, null, z); 637 if (t.s < 0 && z.compareTo(aq.ZERO) > 0) { 638 this.m.subTo(z, z) 639 } 640 return z 641 } 642 function ar(t) 643 { 644 var z = h(); 645 t.copyTo(z); 646 this.reduce(z); 647 return z 648 } 649 function P(t) 650 { 651 while (t.t <= this.mt2) { 652 t[t.t++] = 0 653 } 654 for (var az = 0; az < this.m.t; ++az) 655 { 656 var z = t[az] & 32767; 657 var aA = (z * this.mpl + (((z * this.mph + (t[az] >> 15) * this.mpl) & this.um) << 15)) & t.DM; 658 z = az + this.m.t; 659 t[z] += this.m.am(0, aA, t, az, 0, this.m.t); 660 while (t[z] >= t.DV) { 661 t[z] -= t.DV; 662 t[++z]++ 663 } 664 } 665 t.clamp(); 666 t.drShiftTo(this.m.t, t); 667 if (t.compareTo(this.m) >= 0) { 668 t.subTo(this.m, t) 669 } 670 } 671 function al(t, z) 672 { 673 t.squareTo(z); 674 this.reduce(z) 675 } 676 function y(t, az, z) 677 { 678 t.multiplyTo(az, z); 679 this.reduce(z) 680 } 681 f.prototype.convert = ai; 682 f.prototype.revert = ar; 683 f.prototype.reduce = P; 684 f.prototype.mulTo = y; 685 f.prototype.sqrTo = al; 686 function i() 687 { 688 return ((this.t > 0) ? 
(this [0] & 1) : this.s) == 0 689 } 690 function x(aE, aF) 691 { 692 if (aE > 4294967295 || aE < 1) { 693 return aq.ONE 694 } 695 var aD = h(), az = h(), aC = aF.convert(this), aB = j(aE) - 1; 696 aC.copyTo(aD); 697 while (--aB >= 0) { 698 aF.sqrTo(aD, az); 699 if ((aE & (1 << aB)) > 0) { 700 aF.mulTo(az, aC, aD) 701 } 702 else { 703 var aA = aD; 704 aD = az; 705 az = aA; 706 } 707 } 708 return aF.revert(aD) 709 } 710 function am(az, t) 711 { 712 var aA; 713 if (az < 256 || t.isEven()) { 714 aA = new K(t) 715 } 716 else { 717 aA = new f(t) 718 } 719 return this.exp(az, aA) 720 } 721 aq.prototype.copyTo = X; 722 aq.prototype.fromInt = n; 723 aq.prototype.fromString = w; 724 aq.prototype.clamp = O; 725 aq.prototype.dlShiftTo = ap; 726 aq.prototype.drShiftTo = W; 727 aq.prototype.lShiftTo = s; 728 aq.prototype.rShiftTo = l; 729 aq.prototype.subTo = aa; 730 aq.prototype.multiplyTo = D; 731 aq.prototype.squareTo = Q; 732 aq.prototype.divRemTo = E; 733 aq.prototype.invDigit = B; 734 aq.prototype.isEven = i; 735 aq.prototype.exp = x; 736 aq.prototype.toString = q; 737 aq.prototype.negate = R; 738 aq.prototype.abs = ak; 739 aq.prototype.compareTo = G; 740 aq.prototype.bitLength = u; 741 aq.prototype.mod = N; 742 aq.prototype.modPowInt = am; 743 aq.ZERO = c(0); 744 aq.ONE = c(1); 745 function k() 746 { 747 this.i = 0; 748 this.j = 0; 749 this.S = new Array() 750 } 751 function e(aB) 752 { 753 var aA, z, az; 754 for (aA = 0; aA < 256; ++aA) { 755 this.S[aA] = aA 756 } 757 z = 0; 758 for (aA = 0; aA < 256; ++aA) 759 { 760 z = (z + this.S[aA] + aB[aA % aB.length]) & 255; 761 az = this.S[aA]; 762 this.S[aA] = this.S[z]; 763 this.S[z] = az 764 } 765 this.i = 0; 766 this.j = 0 767 } 768 function a() 769 { 770 var z; 771 this.i = (this.i + 1) & 255; 772 this.j = (this.j + this.S[this.i]) & 255; 773 z = this.S[this.i]; 774 this.S[this.i] = this.S[this.j]; 775 this.S[this.j] = z; 776 return this.S[(z + this.S[this.i]) & 255] 777 } 778 k.prototype.init = e; 779 k.prototype.next = a; 
780 function an() 781 { 782 return new k() 783 } 784 var M = 256; 785 var m; 786 var T; 787 var ab; 788 function d(t) 789 { 790 T[ab++]^ = t & 255; 791 T[ab++]^ = (t >> 8) & 255; 792 T[ab++]^ = (t >> 16) & 255; 793 T[ab++]^ = (t >> 24) & 255; 794 if (ab >= M) { 795 ab -= M 796 } 797 } 798 function S() 799 { 800 d(new Date().getTime()) 801 } 802 if (T == null) 803 { 804 T = new Array(); 805 ab = 0; 806 var I; 807 if (navigator.appName == "Netscape" && navigator.appVersion < "5" && window.crypto && typeof (window.crypto.random) === "function") 808 { 809 var F = window.crypto.random(32); 810 for (I = 0; I < F.length; ++I) { 811 T[ab++] = F.charCodeAt(I) & 255; 812 } 813 } 814 while (ab < M) { 815 I = Math.floor(65536 * Math.random()); 816 T[ab++] = I >>> 8; 817 T[ab++] = I & 255 818 } 819 ab = 0; 820 S() 821 } 822 function C() 823 { 824 if (m == null) { 825 S(); 826 m = an(); 827 m.init(T); 828 for (ab = 0; ab < T.length; ++ab) { 829 T[ab] = 0 830 } 831 ab = 0 832 } 833 return m.next() 834 } 835 function au(z) 836 { 837 var t; 838 for (t = 0; t < z.length; ++t) { 839 z[t] = C(); 840 } 841 } 842 function ac() {} 843 ac.prototype.nextBytes = au; 844 function g(z, t) 845 { 846 return new aq(z, t) 847 } 848 function ag(az, aA) 849 { 850 var t = ""; 851 var z = 0; 852 while (z + aA < az.length) { 853 t += az.substring(z, z + aA) + " "; 854 z += aA 855 } 856 return t + az.substring(z, az.length) 857 } 858 function r(t) 859 { 860 if (t < 16) { 861 return "0" + t.toString(16) 862 } 863 else { 864 return t.toString(16); 865 } 866 } 867 function ae(aA, aD) 868 { 869 if (aD < aA.length + 11) { 870 alert("Message too long for RSA"); 871 return null 872 } 873 var aC = new Array(); 874 var az = aA.length - 1; 875 while (az >= 0 && aD > 0) 876 { 877 var aB = aA.charCodeAt(az--); 878 if (aB < 128) { 879 aC[--aD] = aB 880 } 881 else 882 { 883 if ((aB > 127) && (aB < 2048)) { 884 aC[--aD] = (aB & 63) | 128; 885 aC[--aD] = (aB >> 6) | 192 886 } 887 else { 888 aC[--aD] = (aB & 63) | 128; 
889 aC[--aD] = ((aB >> 6) & 63) | 128; 890 aC[--aD] = (aB >> 12) | 224 891 } 892 } 893 } 894 aC[--aD] = 0; 895 var z = new ac(); 896 var t = new Array(); 897 while (aD > 2) { 898 t[0] = 0; 899 while (t[0] == 0) { 900 z.nextBytes(t) 901 } 902 aC[--aD] = t[0] 903 } 904 aC[--aD] = 2; 905 aC[--aD] = 0; 906 return new aq(aC) 907 } 908 function L() 909 { 910 this.n = null; 911 this.e = 0; 912 this.d = null; 913 this.p = null; 914 this.q = null; 915 this.dmp1 = null; 916 this.dmq1 = null; 917 this.coeff = null 918 } 919 function o(z, t) 920 { 921 if (z != null && t != null && z.length > 0 && t.length > 0) { 922 this.n = g(z, 16); 923 this.e = parseInt(t, 16) 924 } 925 else { 926 alert("Invalid RSA public key") 927 } 928 } 929 function V(t) 930 { 931 return t.modPowInt(this.e, this.n) 932 } 933 function p(az) 934 { 935 var t = ae(az, (this.n.bitLength() + 7) >> 3); 936 if (t == null) { 937 return null 938 } 939 var aA = this.doPublic(t); 940 if (aA == null) { 941 return null 942 } 943 var z = aA.toString(16); 944 if ((z.length & 1) == 0) { 945 return z 946 } 947 else { 948 return "0" + z; 949 } 950 } 951 L.prototype.doPublic = V; 952 L.prototype.setPublic = o; 953 L.prototype.encrypt = p; 954 this.RSAKey = L; 955 }).call(sinaSSOEncoder);
Working backwards step by step like this solves the password encryption problem.
Given valid parameters, su and sp can then be computed.
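The RSAKey code boils down to textbook RSA with PKCS#1 v1.5 type-2 padding (the function that writes 0, 2, non-zero random bytes, 0, then the message) followed by a modular exponentiation. Here is a rough Node.js sketch of the same computation using BigInt. The function names are invented for illustration. The two separators are left as parameters on purpose: the listing above shows single spaces, but whitespace can be lost when code is pasted, and some copies of this script use a tab and a newline there, so compare against the live script before relying on either.

```javascript
const crypto = require("crypto");

// Square-and-multiply modular exponentiation on BigInt values.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// PKCS#1 v1.5 type-2 padding: 00 02 <non-zero random bytes> 00 <message>.
function pkcs1Pad(message, keyBytes) {
  if (keyBytes < message.length + 11) {
    throw new Error("Message too long for RSA");
  }
  const fillerLength = keyBytes - message.length - 3;
  // Replace any zero byte so the filler never contains the 00 terminator.
  const filler = Uint8Array.from(crypto.randomBytes(fillerLength), (b) => b || 1);
  return Buffer.concat([Buffer.from([0, 2]), filler, Buffer.from([0]), message]);
}

// sep1/sep2 default to the spaces shown in the listing (see the note above).
function encryptPassword(pubkeyHex, servertime, nonce, password, sep1 = " ", sep2 = " ") {
  const n = BigInt("0x" + pubkeyHex);
  const e = 0x10001n; // setPublic(..., "10001") parses the exponent as hex
  const keyBytes = (n.toString(2).length + 7) >> 3;
  const message = Buffer.from([servertime, nonce].join(sep1) + sep2 + password, "utf8");
  const padded = BigInt("0x" + pkcs1Pad(message, keyBytes).toString("hex"));
  const hex = modPow(padded, e, n).toString(16);
  return hex.length % 2 === 0 ? hex : "0" + hex; // even length, like encrypt()
}

// Usage with the values from the prelogin response shown earlier.
const samplePubkey =
  "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443";
console.log(encryptPassword(samplePubkey, 1390393012, "YNJDUH", "my-password"));
```

Because the padding is random, the output differs on every call, so two runs cannot be compared byte for byte. Only the server, which holds the private key, can confirm that a value is correct.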
(iii) Simulating the login requests:
1) First send a GET request: http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.11)&_=1390463770958 <-- timestamp
The response is:
sinaSSOController.preloginCallBack({"retcode":0,"servertime":1390466275,"pcid":"xd-5cdeb6364e29309bdea78e342a27bb3d2389","nonce":"SBUEXR","pubkey":"EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443","rsakv":"1330428213","exectime":0})
Extract servertime, nonce, pubkey and rsakv from it; they are required inputs for the next step.
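The only part of that URL that changes between runs is the trailing "_" timestamp, so the URL can be rebuilt each time. A small sketch follows, with buildPreloginUrl as a made-up helper name. Note that URLSearchParams percent-encodes the parentheses in the client value, which a server would normally decode to the same string; that equivalence is an assumption worth checking against a capture.

```javascript
// Hypothetical helper: rebuild the prelogin URL with a fresh "_" timestamp.
function buildPreloginUrl(now = Date.now()) {
  const params = new URLSearchParams({
    entry: "weibo",
    callback: "sinaSSOController.preloginCallBack",
    su: "",
    rsakt: "mod",
    client: "ssologin.js(v1.4.11)",
    _: String(now), // millisecond timestamp, as in the captured request
  });
  return "http://login.sina.com.cn/sso/prelogin.php?" + params.toString();
}

console.log(buildPreloginUrl());
```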
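2) Then send a POST request: http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.11)
Fill in the form parameters seen in the captured request, setting their values with setEntity.
servertime, rsakv and nonce come from the previous step; su is produced either with the JS code obtained above or with plain Base64 encoding; sp is produced with the JS code (taking pubkey, servertime, nonce and password as inputs).
Take special care: the way su and sp are generated may change between Weibo versions, so check and update it regularly!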
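For illustration, the form body could be assembled along these lines. buildLoginForm is a hypothetical name, and only the fields discussed above are filled in. The captured request carries additional fields (makeRequest sets entry, gateway, from, savestate, useticket and others), so their values should be copied from a Fiddler capture and passed through the extra argument.

```javascript
// Hypothetical helper: assemble the urlencoded body of the login POST.
// "extra" holds the remaining fields copied from a captured request.
function buildLoginForm({ su, sp, servertime, nonce, rsakv }, extra = {}) {
  const form = new URLSearchParams({
    ...extra,
    su,
    sp,
    servertime: String(servertime),
    nonce,
    rsakv,
    pwencode: "rsa2", // the encryption scheme named in the request
  });
  return form.toString();
}

// Placeholder values only; su and sp come from the earlier steps.
console.log(buildLoginForm(
  { su: "dXNlcg==", sp: "ab12", servertime: 1390466275, nonce: "SBUEXR", rsakv: "1330428213" },
  { gateway: "1" }
));
```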
This step returns the content below; take the URL inside location.replace("...") as the URL of the next request:
<html>
<head>
<title>新浪通行证</title>
<meta http-equiv="refresh" content="0; url='http://weibo.com/sso/login.php?ssosavestate=1393172391&url=http%3A%2F%2Fweibo.com%2Fajaxlogin.php%3Fframelogin%3D1%26callback%3Dparent.sinaSSOController.feedBackUrlCallBack%26sudaref%3Dweibo.com&ticket=ST-MTc2MDM3OTk2NA==-1390580391-gz-312B05A73C71C256F9C7E79E92F78E72&retcode=0'"/>
<meta http-equiv="Content-Type" content="text/html; charset=GBK" />
</head>
<body bgcolor="#ffffff" text="#000000" link="#0000cc" vlink="#551a8b" alink="#ff0000">
<script type="text/javascript" language="javascript">
location.replace("http://weibo.com/sso/login.php?ssosavestate=1393172391&url=http%3A%2F%2Fweibo.com%2Fajaxlogin.php%3Fframelogin%3D1%26callback%3Dparent.sinaSSOController.feedBackUrlCallBack%26sudaref%3Dweibo.com&ticket=ST-MTc2MDM3OTk2NA==-1390580391-gz-312B05A73C71C256F9C7E79E92F78E72&retcode=0");
</script>
</body>
</html>
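Pay special attention here: only retcode=0 means every previous step was done correctly. Otherwise, check which earlier step went wrong (for example a request parameter such as "_" that was not filled in properly, or a wrong URL).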
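A small sketch of that check: pull the URL out of location.replace("...") with a regular expression and read retcode from its query string. The helper name extractRedirect is invented, and the sample ticket in the usage line is a placeholder, not a real one.

```javascript
// Hypothetical helper: find the location.replace("...") target in the login
// response and report the retcode carried in that URL.
function extractRedirect(html) {
  const match = html.match(/location\.replace\(\s*["']([^"']+)["']\s*\)/);
  if (!match) return null;
  const url = match[1];
  const code = url.match(/[?&]retcode=(-?\d+)/);
  return { url, retcode: code ? Number(code[1]) : null };
}

const loginPage =
  '<script>location.replace("http://weibo.com/sso/login.php?ticket=ST-PLACEHOLDER&retcode=0");</script>';
const redirect = extractRedirect(loginPage);
if (redirect && redirect.retcode === 0) {
  console.log("login accepted, next URL:", redirect.url);
} else {
  console.log("login rejected or unexpected response");
}
```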
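3) Next send a POST request to the URL just extracted from location.replace.
From the response headers of this request, collect every "Set-Cookie" value and send them as the Cookie header of the actual login request in the next step.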
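Converting the collected header lines into a single Cookie value can be sketched as follows. cookieHeaderFrom is an invented name and the cookie names in the usage line are placeholders. The sketch keeps only the name=value part of each line and ignores attributes such as path, domain and expiry, which is a simplification.

```javascript
// Turn raw Set-Cookie header lines into one Cookie request header value.
// Simplification: attributes (path, domain, expires, ...) are ignored.
function cookieHeaderFrom(setCookieLines) {
  const jar = new Map();
  for (const line of setCookieLines) {
    const pair = line.split(";")[0].trim(); // "name=value" precedes the attributes
    const eq = pair.indexOf("=");
    if (eq > 0) {
      jar.set(pair.slice(0, eq), pair.slice(eq + 1)); // later values win
    }
  }
  return [...jar].map(([name, value]) => name + "=" + value).join("; ");
}

// Placeholder cookie names and values.
console.log(cookieHeaderFrom([
  "sid=abc; path=/; domain=.weibo.com",
  "token=x%3D1; path=/; httponly",
]));
// sid=abc; token=x%3D1
```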
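4) Finally send a GET request with your personal home page as the URL (besides the cookie, other headers must be set as well; capture one login with Fiddler to see which ones to fill in).
On success the home page content is returned; otherwise you get the sign-up invitation page (the one saying something like "Hi, so-and-so, join Weibo and follow me").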
(2) Simulated browsing module
(i) Parse the page to get the Weibo posts and implement paging
Note: the web version of Weibo has several refresh (or paging) functions:
1) Uses: GET http://weibo.com/aj/mblog/fsearch?_wv=5&_k=1392707288692116&_t=0&since_id=3679341378537185&__rnd=1392708157261 HTTP/1.1
Here _wv=5 (apparently tied to the account), _k=1392707299692116 and _t=0 (purpose unclear for now) are all optional.
since_id: (the counterpart of end_id) the mid of the last post loaded by the previous refresh, i.e. the post at which the listing stops going down
__rnd: the current system time
2) Uses: GET http://weibo.com/aj/mblog/fsearch?_wv=5&page=1&count=15&max_id=3679341991665814&pre_page=1&end_id=3679341378537185&pagebar=0&_k=139270728869244&_t=0&__rnd=1392707310629 HTTP/1.1
3) Uses: GET http://weibo.com/aj/mblog/fsearch?_wv=5&page=2&count=15&pre_page=1&_k=139270851073081&_t=0&end_id=3679348915284679&end_msign=-1&__rnd=1392708820178 HTTP/1.1
2) and 3) are very similar; in fact they are almost identical.
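Since the three requests differ only in their query parameters, one helper can build all of them. The sketch below uses the invented name buildFeedUrl and fills __rnd with the current time in milliseconds, matching the 13-digit values in the captures. Which parameters the server really requires is not established here beyond what the captures show.

```javascript
// Hypothetical helper: build an fsearch URL from the given parameters and
// append __rnd (current system time in milliseconds).
function buildFeedUrl(params, now = Date.now()) {
  const query = new URLSearchParams({ ...params, __rnd: String(now) });
  return "http://weibo.com/aj/mblog/fsearch?" + query.toString();
}

// Refresh, as in request 1): only since_id is passed.
console.log(buildFeedUrl({ since_id: "3679341378537185" }));

// Paging, as in request 3): page, count, pre_page and end_id from the capture.
console.log(buildFeedUrl({ page: "2", count: "15", pre_page: "1", end_id: "3679348915284679" }));
```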
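(ii) Simulate the request that fetches the comments of a post
Note: every click on the comment button sends a GET request.
The request looks like this:
http://weibo.com/aj/comment/small?_wv=5&act=list&mid=3677461138997998&uid=1760379964&isMain=true&ouid=3926428816&special_feed=2&location=home&_t=0&__rnd=1392295975327
Comparing many such requests reveals what the parameters mean:
_wv:
mid: the id of the post
uid: the id of the logged-in user
ouid: the id of the user who owns the post
special_feed:
__rnd: the system time
mid, uid and ouid can all be taken from the personal information and the posts on the page shown after login.
Sending the GET request above returns the comments. (Note: the response is JSON. The HTML it carries has to be extracted and unescaped; the comment text sits inside <dd></dd>.)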
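A rough sketch of that extraction is shown below. The location of the HTML inside the JSON is an assumption: the code reads it from data.html, a field path chosen for illustration that must be adjusted to what a real capture shows. extractComments is likewise an invented name, and the regular expressions are only adequate for simple, non-nested <dd> elements.

```javascript
// Assumption: the HTML fragment sits at data.html in the JSON body; adjust
// the path to match a real response.
function extractComments(jsonText) {
  // JSON.parse already undoes JSON escapes such as \" \/ and \uXXXX.
  const payload = JSON.parse(jsonText);
  const html = (payload.data && payload.data.html) || "";
  const comments = [];
  const ddPattern = /<dd[^>]*>([\s\S]*?)<\/dd>/g;
  let match;
  while ((match = ddPattern.exec(html)) !== null) {
    // Drop nested tags and collapse whitespace to get the plain text.
    const text = match[1].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    if (text) comments.push(text);
  }
  return comments;
}

// Made-up response in the assumed shape.
const commentJson = JSON.stringify({
  data: { html: '<dl><dd><a href="/u/1">Alice</a>: hello</dd><dd>  second\n one </dd></dl>' },
});
console.log(extractComments(commentJson)); // [ 'Alice: hello', 'second one' ]
```

For anything beyond a quick experiment, a real HTML parser would be more robust than regular expressions.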
(iii) Simulate access to the friend list and fetch the content of friends' home pages