
Scrapy practice: 电影天堂 (Movie Heaven)

Posted: 2023-10-05 09:55:22
Tags: movie, self, practice, spider, item, scrapy, heaven, movieName

movie.py

import scrapy
from movieProject.items import MovieprojectItem


class MovieSpider(scrapy.Spider):
    name = 'movie'
    allowed_domains = ['www.ygdy8.net']
    start_urls = ['https://www.ygdy8.net/html/gndy/china/index.html']

    def parse(self, response):
        print("电影天堂")
        movieList = response.xpath('//table//tr[2]/td[2]/b/a[2]')
        for item in movieList:
            movieName = item.xpath('./text()').extract_first()
            movieUrl = 'https://www.ygdy8.net' + item.xpath('./@href').extract_first()
            print(movieName, movieUrl)
            # Pass movieName to second_parse via meta
            yield scrapy.Request(url=movieUrl, callback=self.second_parse, meta={'movieName': movieName})

    def second_parse(self, response):
        print("Before second parse")
        # On the detail page, extract the address of the image
        secondUrl = response.xpath('//div[@id="Zoom"]//img/@src').extract_first()
        print("Second request:", secondUrl)
        movieName = response.meta['movieName']
        movie = MovieprojectItem(movieName=movieName, movieUrl=secondUrl)
        yield movie
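
The spider builds each detail-page URL by concatenating the site root with the extracted href, which only works when every href is a root-relative path and is never missing. As a rough sketch of a more tolerant approach, the standard library's `urljoin` can resolve the href against the list-page URL; Scrapy's `response.urljoin(href)` does similar joining relative to the response. The helper name `build_detail_url` and the sample hrefs below are made up for illustration and are not taken from the site.

```python
from urllib.parse import urljoin

# URL of the list page, copied from start_urls in the spider.
LIST_PAGE = 'https://www.ygdy8.net/html/gndy/china/index.html'


def build_detail_url(base_url, href):
    """Resolve an href found on base_url; return None when no href was extracted.

    Hypothetical helper, not part of the original spider.
    """
    if href is None:
        # extract_first() returns None when the XPath matches nothing.
        return None
    return urljoin(base_url, href.strip())


# Sample hrefs are invented to show the three common cases.
print(build_detail_url(LIST_PAGE, '/html/gndy/sample/1.html'))   # root-relative path
print(build_detail_url(LIST_PAGE, 'list_sample.html'))           # path relative to the list page
print(build_detail_url(LIST_PAGE, None))                         # nothing extracted
```

Inside `parse`, the same effect should be achievable with `response.urljoin(item.xpath('./@href').extract_first())` after checking that the href is not `None`.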

items.py

import scrapy


class MovieprojectItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    movieName = scrapy.Field()  # movie title taken from the list page
    movieUrl = scrapy.Field()   # image address taken from the detail page

pipelines.py

class MovieprojectPipeline:
    def open_spider(self, spider):
        self.fp = open('movie.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        self.fp.write(str(item))
        return item

    def close_spider(self, spider):
        self.fp.close()
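
The pipeline above writes `str(item)` for each item with no separator, so `movie.json` will not be parseable as JSON. One possible alternative, sketched here under assumptions, is to write one JSON object per line (JSON Lines). The class name `MovieJsonLinesPipeline` and the file name `movie.jsonl` are hypothetical and do not appear in the original project.

```python
import json


class MovieJsonLinesPipeline:
    """Hypothetical variant of MovieprojectPipeline: one JSON object per line."""

    def __init__(self, path='movie.jsonl'):
        # The file name is an assumption; the original pipeline writes to movie.json.
        self.path = path

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.fp = open(self.path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # dict(item) works for scrapy.Item objects as well as plain dicts.
        # ensure_ascii=False keeps Chinese titles readable in the output file.
        self.fp.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.fp.close()
```

Either pipeline only runs if it is registered in the `ITEM_PIPELINES` setting in `settings.py`, for example `{'movieProject.pipelines.MovieprojectPipeline': 300}`. That module path is inferred from the `movieProject.items` import in the spider and assumes the class lives in the project's `pipelines.py`.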

  

From: https://www.cnblogs.com/sgj191024/p/17743094.html
