Learning Python: A Simple Image-Download Crawler

March 3, 2019


## Preface
I had barely started learning Python before I couldn't resist writing a crawler, even though all it does is download images.
## Code
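    # -*- coding: utf-8 -*-
    import os
    import re
    import requests

    j = 1  # j controls the page number
    i = 0  # i numbers the saved files

    url = 'https://wall.alphacoders.com/search.php?search=sound%21+euphonium'
    # This is the search results page to crawl

    # Matches thumbnail links, including the closing quote of the HTML attribute
    pattern = re.compile(r'https://images[0-9]*\.alphacoders\.com/[0-9]{3}/+[^\s"]*g"')

    while j <= 12:
        # Page 1 is the search URL itself; later pages add a single page parameter
        page_url = url if j == 1 else url + '&page=' + str(j)
        print(page_url)
        html = requests.get(page_url).text
        pic_url = pattern.findall(html)

        # Turn each thumbnail link into a full-size link: drop the
        # 'thumb-350-' prefix and the trailing quote. This assumes the two
        # links differ only by that prefix.
        y = []
        for x in pic_url:
            x1 = x.replace('thumb-350-', '')
            y.append(x1[0:-1])
        print('***************************')

        # Download each image and save it to the pictures folder
        for each in y:
            print(each)
            try:
                pic = requests.get(each, timeout=5)  # timeout is in seconds
            except requests.exceptions.RequestException:
                print('!!!!!!!!!!!!')
                continue
            string = os.path.join('pictures', str(i) + '.jpg')
            with open(string, 'wb') as fp:
                fp.write(pic.content)
            i += 1

        j += 1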
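
The link-processing step can be tried on its own, without touching the network. The sketch below is only an illustration: the function name `full_size_urls` and the sample markup are made up, and the pattern is a variant of the script's pattern with the dots escaped and quotes excluded from the match. It assumes, as the script does, that a full-size link is the thumbnail link without the `thumb-350-` prefix.

```python
import re

# Variant of the script's pattern: a thumbnail link followed by the closing
# quote of the HTML attribute. Dots are escaped so they match literally.
THUMB_PATTERN = re.compile(
    r'https://images[0-9]*\.alphacoders\.com/[0-9]{3}/+[^\s"]*g"'
)


def full_size_urls(html):
    """Return full-size image links derived from thumbnail links in html."""
    urls = []
    for match in THUMB_PATTERN.findall(html):
        link = match[:-1]  # drop the trailing quote
        urls.append(link.replace('thumb-350-', ''))
    return urls


# Made-up sample markup, only for illustration
sample_html = (
    '<img src="https://images3.alphacoders.com/123/thumb-350-123456.jpg">\n'
    '<img src="https://images.alphacoders.com/456/thumb-350-456789.png">'
)
print(full_size_urls(sample_html))
```

Keeping this step in a small function makes it easy to check the pattern against a saved copy of a results page before starting a real download run.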
## Dependencies and environment
1. Python 3.x
2. pip
3. requests
4. Windows 10, Windows 7, or Linux

## Site crawled
[https://alphacoders.com/](https://alphacoders.com/)
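## Usage
Tested only on Windows 10; it should work on other systems in principle.
1. Copy the code into a new text file, then save it and rename it to test.py.
2. Search the site for the images you want, then replace the link in this line of test.py with the URL of the results page, and save:
`url = 'https://wall.alphacoders.com/search.php?search=sound%21+euphonium'`
3. Create a pictures folder in the directory that contains test.py.
4. Open PowerShell in the directory that contains test.py, type
`python3 test.py`
and press Enter.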
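
Steps 2 and 3 can also be done from Python instead of by hand. This is a minimal sketch, not part of the original script: `build_search_url` and `ensure_folder` are hypothetical helper names, and the `search` and `page` parameter names are taken from the URLs shown in this post.

```python
import os
from urllib.parse import urlencode

SEARCH_PAGE = 'https://wall.alphacoders.com/search.php'


def build_search_url(keyword, page=1):
    """Build the search URL for a keyword; page 1 carries no page parameter."""
    params = {'search': keyword}
    if page > 1:
        params['page'] = page
    # urlencode escapes '!' as %21 and turns spaces into '+'
    return SEARCH_PAGE + '?' + urlencode(params)


def ensure_folder(name='pictures'):
    """Create the output folder (relative to the working directory) if missing."""
    os.makedirs(name, exist_ok=True)
    return name


print(build_search_url('sound! euphonium'))
print(build_search_url('sound! euphonium', page=2))
```

The first call reproduces the example link used in this post, so a keyword can be swapped in without copying a URL from the browser.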

## Parameters
See the comments in the code.
## Errors
If something goes wrong, check that requests is installed and that your Python version is 3.6.x.
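
Both checks can be scripted. The sketch below is an assumption-laden illustration: `check_environment` is a hypothetical helper, and it treats 3.6 as a minimum version rather than an exact requirement.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 6)):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if sys.version_info < min_version:
        problems.append('Python %d.%d or newer is required' % min_version)
    # find_spec returns None when the package cannot be imported
    if importlib.util.find_spec('requests') is None:
        problems.append('requests is not installed; try: pip install requests')
    return problems


for problem in check_environment():
    print(problem)
```

If the script prints nothing, neither problem was detected and the cause is likely elsewhere, for example a missing pictures folder.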

Last modification: March 21st, 2019 at 10:02 am
