Web scraping, Python

Python: Downloading an Image

Import the library:

import requests

Define the URL to fetch:

url = 'http://github.com/favicon.ico'

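Set the request headers with a browser User-Agent (some sites answer requests from the default client User-Agent with 403 Forbidden as an anti-scraping measure):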

headers = {
    "User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
}

Send the request and read the response body as bytes:

data = requests.get(url, headers=headers).content
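
The line above reads `.content` without checking whether the request succeeded, so an error page could end up saved as if it were the image. A minimal sketch of a guard, assuming a `requests`-style response object; the helper name `image_bytes` and the Content-Type check are illustrative additions, not part of the original script:

```python
def image_bytes(response):
    """Return the response body only if it looks like a successful image download.

    `response` is assumed to behave like requests.Response
    (raise_for_status(), headers, content). The helper name is hypothetical.
    """
    # For a real requests.Response this raises requests.HTTPError
    # on 4xx/5xx statuses such as 403 Forbidden.
    response.raise_for_status()

    # Assumption: the server labels images with an "image/..." Content-Type.
    content_type = response.headers.get("Content-Type", "")
    if not content_type.startswith("image/"):
        raise ValueError(f"expected an image, got Content-Type {content_type!r}")

    return response.content
```

It could be called as `data = image_bytes(requests.get(url, headers=headers, timeout=10))`; the `timeout` value here is an arbitrary choice that keeps the script from waiting indefinitely on an unresponsive server.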

Use the text after the last '/' in the URL as the file name:

path = url.split('/')[-1]
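
Splitting on '/' works for this URL, but it would keep a query string (for example `?size=32`) in the file name and return an empty string for a URL that ends in '/'. A hedged sketch of a more tolerant variant using only the standard library; the function name `filename_from_url` and the `default` fallback name are hypothetical:

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download.bin"):
    """Derive a local file name from the path component of a URL."""
    # urlsplit separates the path from the query string and fragment.
    path = urlsplit(url).path
    # basename returns the text after the last '/', or '' for a trailing slash.
    name = posixpath.basename(unquote(path))
    return name or default


print(filename_from_url("http://github.com/favicon.ico"))  # prints: favicon.ico
```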

Write the bytes to the file:

with open(path, 'wb') as f:
    # The with block closes the file on exit, so no explicit close() is needed.
    f.write(data)
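
Holding the whole body in memory is fine for a small icon. For larger images, a chunked write may be preferable. The following is a sketch under the assumption that the response behaves like a `requests.Response` requested with `stream=True`; the helper name `save_response` and the chunk size are illustrative:

```python
def save_response(response, path, chunk_size=8192):
    """Write a response body to `path` in chunks and return the byte count.

    `response` is assumed to provide iter_content(), as requests.Response does.
    The helper name is hypothetical.
    """
    written = 0
    # The with block closes the file automatically, even if a write fails.
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            # For a binary file, write() returns the number of bytes written.
            written += f.write(chunk)
    return written
```

With `requests`, it could be used as `save_response(requests.get(url, headers=headers, stream=True, timeout=10), path)`.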

Complete code:

import requests
url = 'http://github.com/favicon.ico'
headers = {
    "User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
}
data = requests.get(url, headers=headers).content
path = url.split('/')[-1]
with open(path, 'wb') as f:
    # The with block closes the file on exit, so no explicit close() is needed.
    f.write(data)

Downloaded file:

Original image

favicon.ico
