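Lately I've gotten interested in the audio on 猫耳FM (missevan.com), but keeping a browser open just to listen wastes resources, so I worked out a way to download it all to my local disk.
Here's the code: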
import json

import requests

# Headers copied from a logged-in browser session; the cookie is session-specific.
headers = {
    'cookie': '_uab_collina=155114329401284895128454; token=5c74919c61926876aac815ad%7Cc5ccff118575ba4e%7C1551143324%7Cd9d7d6e29cf9f1a4; MSESSID=43476496d6593f029a7e094286d139d4; Hm_lvt_91a4e950402ecbaeb38bd149234eb7cc=1553064079,1553159901,1553759386,1553821257; SL_GWPT_Show_Hide_tmp=1; SL_wptGlobTipTmp=1; _csrf=65fb70e19ecd06f08194768c3057d6f4e61e8e6a822089831dce731ab9dca8bea%3A2%3A%7Bi%3A0%3Bs%3A5%3A%22_csrf%22%3Bi%3A1%3Bs%3A32%3A%22cu24pCUAzqwkYg_iToT5CjlcIjZGr0-A%22%3B%7D; SERVERID=f2284e528b819c0b08ba6b96be65b36e|1553826549|1553821255; Hm_lpvt_91a4e950402ecbaeb38bd149234eb7cc=1553826552',
    'referer': 'https://www.missevan.com/2945631/',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
    'x-requested-with': 'XMLHttpRequest'
}

# Step 1: fetch the user's sound list (10 entries per page).
url = 'https://www.missevan.com/2945631/getusersound?page_size=10'
response = requests.get(url, headers=headers)
sound_list = json.loads(response.text).get('info')

# Headers for the per-sound requests; the referer is filled in for each item below.
headers2 = {
    'cookie': '_uab_collina=155114329401284895128454; token=5c74919c61926876aac815ad%7Cc5ccff118575ba4e%7C1551143324%7Cd9d7d6e29cf9f1a4; MSESSID=43476496d6593f029a7e094286d139d4; Hm_lvt_91a4e950402ecbaeb38bd149234eb7cc=1553064079,1553159901,1553759386,1553821257; SL_GWPT_Show_Hide_tmp=1; SL_wptGlobTipTmp=1; _csrf=65fb70e19ecd06f08194768c3057d6f4e61e8e6a822089831dce731ab9dca8bea%3A2%3A%7Bi%3A0%3Bs%3A5%3A%22_csrf%22%3Bi%3A1%3Bs%3A32%3A%22cu24pCUAzqwkYg_iToT5CjlcIjZGr0-A%22%3B%7D; SERVERID=f2284e528b819c0b08ba6b96be65b36e|1553828124|1553821255; Hm_lpvt_91a4e950402ecbaeb38bd149234eb7cc=1553828128',
    'referer': '',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
    'x-requested-with': 'XMLHttpRequest'
}

for item in sound_list['Datas']:
    # Step 2: ask for this sound's details, with the player page as referer.
    ref = 'https://www.missevan.com/sound/player?id=' + str(item['id'])
    headers2['referer'] = ref
    url = 'https://www.missevan.com/sound/getsound?soundid=' + str(item['id'])
    # Pass headers by keyword: the second positional argument of requests.get is params.
    response = requests.get(url, headers=headers2)
    info = json.loads(response.text).get('info')

    # Step 3: download the 128 kbps file and save it under its id.
    url = 'http://static.missevan.com/' + str(info['sound']['soundurl_128'])
    # Raw string plus an explicit '\\', so the backslashes are not read as escapes.
    with open(r'C:\Users\16609\Desktop\MRFM' + '\\' + str(item['id']) + '.mp3', 'wb') as f:
        response = requests.get(url)
        f.write(response.content)
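After finishing, I noticed a rather silly problem. The audio is all in the same style, so sorting it into categories isn't really needed, but proper file names are, because I also want to tell the newer files from the older ones. The fix is simple: while scraping, also pull the name out of the JSON response and use it the same way I used the id for naming, so I won't go into it here.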
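For anyone who does want the names, here is a minimal sketch of that step. It only shows how a title could be turned into a file name that Windows accepts. The helper `build_filename` is hypothetical, and the `soundstr` key used in the sample data is an assumption about what the getsound JSON contains, so check the actual response before relying on it.

```python
import re

# Characters that Windows does not allow in file names, plus control characters.
_INVALID = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def build_filename(sound_id, title, ext='.mp3'):
    """Return '<id>_<cleaned title><ext>', or '<id><ext>' if no usable title."""
    cleaned = _INVALID.sub('_', str(title or '')).strip(' .')
    if not cleaned:
        return f'{sound_id}{ext}'
    return f'{sound_id}_{cleaned}{ext}'


# Hypothetical sample shaped like the 'info' object; 'soundstr' is an assumed key.
sample = {'sound': {'id': 123, 'soundstr': 'demo: part 1/2'}}
name = build_filename(sample['sound']['id'], sample['sound'].get('soundstr'))
```

Keeping the id at the front means two sounds with the same title still get different file names.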