
[Project Walkthrough] A Python-Based System for Crawling Web Novel Rankings and Visualizing the Data

Published: 2024-08-13 18:53:33

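Note: this article shows only part of the project's functionality. For more details, use the contact information at the end of the article.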


1 Development Environment

Language: Python
Frameworks: Flask, web crawler (Scrapy)
Database: MySQL
IDE: PyCharm

2 System Design

2.1 Design Background

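Advances in the internet and digital technology have changed how people live and have reshaped how literature is written and distributed. Web novels, an emerging literary form with a distinctive distribution channel and a broad readership, now hold a significant place in literature. Collecting and analyzing information about web novels effectively can provide technical support for the healthy development of online literature and a richer cultural life. This project therefore uses Python, together with the Scrapy crawling framework, data processing techniques, and the Flask framework, to build a complete system that crawls, processes, stores, and visually analyzes web novel data. The system is intended to improve reading recommendations for web novels and to give literary research a new perspective and method, which gives it both theoretical and practical value.
In the digital era, online literature has grown rapidly on the internet and has become a cultural phenomenon in its own right. Web novels, a major branch of online literature, attract large numbers of readers with their varied subject matter and convenient reading experience. As the number of web novels surges, helping readers pick out high-quality works and reach the content they care about efficiently has become a pressing problem. Designing and building a Python-based system that crawls web novel ranking information and visualizes it therefore has practical value for making better use of online literature resources and supporting their healthy development.
With the arrival of the big data era, novel reading has shifted from paper to screens, and this change in reader habits has driven explosive growth in the number of web novels. As early as 2020, China's web novel market had reached 24.98 billion yuan, with 460 million users, or 46.5% of China's internet users, and roughly 29.059 million works had been written. The number of web novel sites in China has also multiplied in recent years. Well-known ones include Qidian (起点小说网), Biquge (笔趣阁小说网), Faloo (飞卢中文网), and Shuqi (书旗小说网), and interest in the industry remains high. Chinese web novels are also popular overseas, including in the United States, Southeast Asia, and Russia.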

2.2 Design Scope

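Based on an analysis of the platform and its functional requirements, the system is divided into data collection, data cleaning and processing, data visualization, user-side pages, and admin-side pages. Data collection gathers novel ranking data from Qidian (起点中文网) and deals with the site's anti-crawling measures, using techniques such as suitable request headers and a proxy pool. Data processing covers duplicate values, empty values, and anomalies. Data visualization appears on the admin side, in the user dashboard, as a large-screen display.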
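The request-header and proxy-pool handling mentioned above could look roughly like the following. This is a minimal sketch, not the project's own code: the class name `RotatingIdentityMiddleware`, the User-Agent strings, and the proxy addresses are all hypothetical. It follows the shape of a Scrapy downloader middleware (a `process_request(request, spider)` method that returns `None` to let the request continue) and sets `request.meta["proxy"]`, which Scrapy's built-in proxy middleware reads.

```python
import random

# Hypothetical pools; real values would come from project configuration.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0 Safari/537.36",
]
PROXIES = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]


class RotatingIdentityMiddleware:
    """Vary the User-Agent header and the proxy on every outgoing request."""

    def __init__(self, user_agents=USER_AGENTS, proxies=PROXIES, rng=None):
        self.user_agents = list(user_agents)
        self.proxies = list(proxies)
        self.rng = rng or random.Random()

    def process_request(self, request, spider):
        # A Scrapy request exposes dict-like `headers` and a `meta` dict.
        request.headers["User-Agent"] = self.rng.choice(self.user_agents)
        if self.proxies:
            request.meta["proxy"] = self.rng.choice(self.proxies)
        return None  # returning None lets Scrapy keep processing the request
```

In a Scrapy project a class like this would be enabled through the `DOWNLOADER_MIDDLEWARES` setting. Picking a random entry per request is the simplest policy; a production pool would usually also drop proxies that fail.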
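The user-side pages cover login and registration, and display the crawled novel data on the web novels page, where it can be searched. The news page shows news items published from the admin side, and each user can post comments to the message board. The structure of the user function modules is shown in Figure 3.
The admin-side pages can display and manage user information, trigger crawling of the novel rankings, and manage the messages users submit. They also hold system-wide management options, including publishing system announcements.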
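The three cleaning steps named in this section (duplicates, empty values, anomalies) can be illustrated without any third-party library. The sketch below is an assumption-laden illustration, not the project's implementation, which works on a pandas DataFrame: the helper names `clean_rows` and `drop_outliers`, the choice of `name` plus `author` as the duplicate key, and the `"N/A"` fill value are all hypothetical.

```python
from statistics import mean, pstdev


def clean_rows(rows, key_fields=("name", "author"), fill_value="N/A"):
    """Drop rows that repeat an earlier key and fill empty values."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            continue  # duplicate of an earlier row
        seen.add(key)
        # Replace None and empty strings with the fill value
        cleaned.append({k: (fill_value if v in (None, "") else v) for k, v in row.items()})
    return cleaned


def drop_outliers(rows, field, n_sigma=3.0):
    """Keep rows whose numeric `field` lies within n_sigma standard deviations of the mean."""
    values = [row[field] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(rows)  # all values identical, nothing to drop
    return [row for row in rows if abs(row[field] - mu) <= n_sigma * sigma]
```

The 3-sigma threshold only becomes meaningful with enough rows; on a handful of samples no single value can sit three standard deviations from the mean, so a smaller `n_sigma` or a fixed valid range is the more practical check for small batches.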

3 System Pages

3.1 User Pages

[Screenshots: user pages]

3.2 Admin Pages

[Screenshots: admin pages]

3.3 Demo Video

<iframe allowfullscreen="true" data-mediaembed="csdn" frameborder="0" id="FKVQQ6Kb-1723471650219" src="https://live.csdn.net/v/embed/416932"></iframe>

Design and Implementation of a Web Novel Data Analysis System Based on a Python Crawler


5 Selected Code

5.1 Spider Code

# -*- coding: utf-8 -*-

# Spider module: crawls the novel listing and detail pages

import platform
import re
from urllib.parse import urlparse

import emoji
import numpy as np
import pandas as pd
import pymssql
import pymysql
import scrapy
from sqlalchemy import create_engine

from ..items import WangluoxiaoshuoItem


# Web novel spider
class WangluoxiaoshuoSpider(scrapy.Spider):
    name = 'wangluoxiaoshuoSpider'
    spiderUrl = 'https://www.qidian.com/finish/chanId21-page{}/'
    start_urls = spiderUrl.split(";")
    protocol = ''
    hostname = ''
    realtime = False


    def __init__(self,realtime=False,*args, **kwargs):
        super().__init__(*args, **kwargs)
        self.realtime = realtime=='true'

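    def start_requests(self):
        # Outside realtime mode, reuse cached rows when the cache table exists
        # instead of crawling the site again.
        plat = platform.system().lower()
        if not self.realtime and plat in ('linux', 'windows'):
            connect = self.db_connect()
            cursor = connect.cursor()
            cached = self.table_exists(cursor, 'di2zvh33_wangluoxiaoshuo') == 1
            # Close the connection whether or not the cache table exists
            cursor.close()
            connect.close()
            if cached:
                self.temp_data()
                return
        # Crawl pages 1 .. pageNum - 1 (only the first page as configured here)
        pageNum = 1 + 1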

        for url in self.start_urls:
            if '{}' in url:
                for page in range(1, pageNum):

                    next_link = url.format(page)
                    yield scrapy.Request(
                        url=next_link,
                        callback=self.parse
                    )
            else:
                yield scrapy.Request(
                    url=url,
                    callback=self.parse
                )

    # Parse the listing page
    def parse(self, response):
        _url = urlparse(self.spiderUrl)
        self.protocol = _url.scheme
        self.hostname = _url.netloc
        plat = platform.system().lower()
        if not self.realtime and plat in ('linux', 'windows'):
            connect = self.db_connect()
            cursor = connect.cursor()
            cached = self.table_exists(cursor, 'di2zvh33_wangluoxiaoshuo') == 1
            # Close the connection whether or not the cache table exists
            cursor.close()
            connect.close()
            if cached:
                self.temp_data()
                return
        # One <li> per novel; avoid naming the result `list`, which shadows the builtin
        rows = response.css('ul[class="all-img-list cf"] li')
        for item in rows:
            fields = WangluoxiaoshuoItem()

            # Listing-page fields: (item key, CSS selector, prefix).
            # None of the rules contains the regex marker '(.*?)', so every value
            # is read with item.css(). The src/href values are expected to be
            # protocol-relative ('//...'), hence the 'https:' prefix.
            listing_rules = [
                ('name', 'div.book-mid-info h2 a::text', ''),
                ('picture', 'div.book-img-box a img::attr(src)', 'https:'),
                ('fenlei', 'a.go-sub-type::text', ''),
                ('miaoshu', 'p.intro::text', ''),
                ('xqdz', 'div.book-img-box a::attr(href)', 'https:'),
            ]
            for key, selector, prefix in listing_rules:
                try:
                    fields[key] = prefix + self.remove_html(item.css(selector).extract_first())
                except Exception:
                    # Skip a field that cannot be stored on the item
                    pass
            # Resolve the detail-page link against the current page URL.
            # response.urljoin handles absolute, protocol-relative ('//...') and
            # relative hrefs, so the scheme is never prepended twice.
            detail_href = item.css('div.book-img-box a::attr(href)').extract_first()
            if not detail_href:
                continue
            detailUrlRule = response.urljoin(detail_href)
            yield scrapy.Request(url=detailUrlRule, meta={'fields': fields},  callback=self.detail_parse, dont_filter=True)

    # Parse the detail page
    def detail_parse(self, response):
        fields = response.meta['fields']

        # (item key, CSS selector, converter) for each value on the detail page.
        # zishu = word count, zongtuijian = total recommendations,
        # zhoutuijian = weekly recommendations.
        detail_rules = [
            # '作者:' is the "Author:" label that precedes the name on the page
            ('author', 'span.author::text', lambda s: s.replace('作者:', '')),
            ('zishu', 'p.count em::text', str),
            ('zongtuijian', 'p.count em:nth-child(3)::text', str),
            ('zhoutuijian', 'p.count em:nth-child(5)::text', int),
            ('worknum', 'div.work-number em.color-font-card::text', int),
            ('writenum', 'div.write em.color-font-card::text', str),
            ('days', 'div.days em.color-font-card::text', int),
        ]
        for key, selector, convert in detail_rules:
            try:
                text = self.remove_html(response.css(selector).extract_first())
                fields[key] = convert(text)
            except Exception:
                # Leave the field unset when the value is missing or not numeric
                pass
        return fields

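    # Data cleaning
    def pandas_filter(self):
        engine = create_engine('mysql+pymysql://root:123456@localhost/spiderdi2zvh33?charset=UTF8MB4')
        df = pd.read_sql('select * from wangluoxiaoshuo limit 50', con = engine)

        # Drop duplicate rows; pandas returns a new DataFrame, so keep the result
        df = df.drop_duplicates()

        # Drop rows in which every column is empty
        df = df.dropna(how='all')

        # Fill the remaining empty cells with '暂无' ("not available")
        df = df.fillna(value = '暂无')

        # Outlier filtering, demonstrated on random data

        # Keep values between 100 and 800 inclusive
        a = np.random.randint(0, 1000, size = 200)
        cond = (a<=800) & (a>=100)
        a = a[cond]

        # Standard normal samples, so the standard deviation is 1
        b = np.random.randn(100000)
        # 3-sigma rule: values more than 3 standard deviations from 0 are outliers
        cond = np.abs(b) > 3 * 1
        b = b[~cond]

        # Normally distributed DataFrame
        df2 = pd.DataFrame(data = np.random.randn(10000,3))
        # 3-sigma rule per column; abs() catches outliers on both sides
        cond = (df2.abs() > 3*df2.std()).any(axis = 1)
        # Index labels of the rows that contain an outlier
        index = df2[cond].index
        # Drop those rows
        df2 = df2.drop(labels=index,axis = 0)

        return df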

    # Strip leftover HTML tags
    def remove_html(self, html):
        if html is None:
            return ''
        pattern = re.compile(r'<[^>]+>', re.S)
        return pattern.sub('', html).strip()

    # Open a database connection
    def db_connect(self):
        # `db_type` avoids shadowing the builtin `type`
        db_type = self.settings.get('TYPE', 'mysql')
        host = self.settings.get('HOST', 'localhost')
        port = int(self.settings.get('PORT', 3306))
        user = self.settings.get('USER', 'root')
        password = self.settings.get('PASSWORD', '123456')

        # Prefer a databaseName attribute set on the spider, else the DATABASE setting
        database = getattr(self, 'databaseName', self.settings.get('DATABASE', ''))

        if db_type == 'mysql':
            connect = pymysql.connect(host=host, port=port, db=database, user=user, passwd=password, charset='utf8')
        else:
            connect = pymssql.connect(host=host, user=user, password=password, database=database)
        return connect

    # Check whether a table exists (returns 1 or 0)
    def table_exists(self, cursor, table_name):
        cursor.execute("show tables;")
        # Each row is a one-element tuple holding a table name
        table_list = [row[0] for row in cursor.fetchall()]

        if table_name in table_list:
            return 1
        else:
            return 0

    # Copy up to 50 random cached rows that are not yet in the main table
    def temp_data(self):

        connect = self.db_connect()
        cursor = connect.cursor()
        sql = '''
            insert into `wangluoxiaoshuo`(
                id
                ,name
                ,picture
                ,author
                ,fenlei
                ,miaoshu
                ,zishu
                ,zongtuijian
                ,zhoutuijian
                ,worknum
                ,writenum
                ,days
                ,xqdz
            )
            select
                id
                ,name
                ,picture
                ,author
                ,fenlei
                ,miaoshu
                ,zishu
                ,zongtuijian
                ,zhoutuijian
                ,worknum
                ,writenum
                ,days
                ,xqdz
            from `di2zvh33_wangluoxiaoshuo`
            where(not exists (select
                id
                ,name
                ,picture
                ,author
                ,fenlei
                ,miaoshu
                ,zishu
                ,zongtuijian
                ,zhoutuijian
                ,worknum
                ,writenum
                ,days
                ,xqdz
            from `wangluoxiaoshuo` where
                `wangluoxiaoshuo`.id=`di2zvh33_wangluoxiaoshuo`.id
            ))
            order by rand()
            limit 50;
        '''

        cursor.execute(sql)
        connect.commit()
        connect.close()
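
Detail links on a listing page can be absolute, protocol-relative (`//host/path`), or relative, and each form needs different handling if it is assembled by hand. A minimal standard-library sketch of resolving all of them the same way is shown below. The helper name `resolve_detail_url` and the book paths in the usage lines are hypothetical, and inside a Scrapy callback `response.urljoin(href)` performs the same resolution against the response URL.

```python
from urllib.parse import urljoin


def resolve_detail_url(page_url, href):
    """Return an absolute URL for `href` found on `page_url`, or None when href is empty."""
    if not href:
        return None
    return urljoin(page_url, href.strip())


# Hypothetical hrefs resolved against the listing URL used by the spider
page = "https://www.qidian.com/finish/chanId21-page1/"
protocol_relative = resolve_detail_url(page, "//www.qidian.com/book/1/")
root_relative = resolve_detail_url(page, "/book/2/")
already_absolute = resolve_detail_url(page, "https://m.qidian.com/book/3/")
```

Delegating to `urljoin` avoids the case where a scheme is added to a URL that already has one.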

5.2 Novel Module Code
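
The group-statistics handler in this module takes a column name from the URL path and concatenates it into a SQL string. Since a column name cannot be passed as a bound parameter, one common safeguard is to accept only names from a known set. The sketch below illustrates that idea under stated assumptions: `build_group_sql` is a hypothetical helper, and the allowed set would in practice be derived from the table's real columns.

```python
def build_group_sql(table, column, allowed_columns):
    """Build a GROUP BY count query, refusing any column outside the allowed set."""
    if column not in allowed_columns:
        raise ValueError(f"unknown column: {column!r}")
    # Backticks quote identifiers in MySQL
    return f"SELECT COUNT(*) AS total, `{column}` FROM `{table}` GROUP BY `{column}`"


# Usage with two columns that appear in the spider's item
sql = build_group_sql("wangluoxiaoshuo", "fenlei", {"fenlei", "author"})
```

Rejecting the request outright is simpler to reason about than trying to escape an arbitrary string.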

# coding:utf-8
__author__ = "ila"

import logging, os, json, configparser
import time
from datetime import datetime

from flask import request, jsonify,session
from sqlalchemy.sql import func,and_,or_,case
from sqlalchemy import cast, Integer,Float
from api.models.brush_model import *
from . import main_bp
from utils.codes import *
from utils.jwt_auth import Auth
from configs import configs
from utils.helper import *
import random
import smtplib
from email.mime.text import MIMEText
from email.utils import formataddr
from email.header import Header
from utils.baidubce_api import BaiDuBce
from api.models.config_model import config




from flask import current_app as app
from utils.spark_func import spark_read_mysql
from utils.hdfs_func import upload_to_hdfs
from utils.mapreduce1 import MRMySQLAvg


# Registration endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/register", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_register():
    if request.method == 'POST':
        # Use the 'msg' key, as the error branch and the other endpoints do
        msg = {'code': normal_code, 'msg': 'success', 'data': [{}]}
        req_dict = session.get("req_dict")


        error = wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = "注册用户已存在"  # "user is already registered"
        return jsonify(msg)

# Login endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/login", methods=['GET','POST'])
def python4svlqv70_wangluoxiaoshuo_login():
    if request.method == 'GET' or request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        # req_model is read from the same session entry as req_dict
        req_model = session.get("req_dict")
        # 'role' is not a column of the table, so drop it before querying
        req_model.pop('role', None)

        datas = wangluoxiaoshuo.getbyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_model)
        if not datas:
            msg['code'] = password_error_code
            msg['msg']='密码错误或用户不存在'  # "wrong password or user does not exist"
            return jsonify(msg)

        req_dict['id'] = datas[0].get('id')
        # Drop the password field ('mima') before issuing the token
        req_dict.pop('mima', None)

        return Auth.authenticate(Auth, wangluoxiaoshuo, req_dict)


# Logout endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/logout", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_logout():
    if request.method == 'POST':
        msg = {
            "msg": "退出成功",  # "logged out successfully"
            "code": 0
        }
        req_dict = session.get("req_dict")

        return jsonify(msg)

# Password reset endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/resetPass", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_resetpass():
    '''
    Reset a supplied password field ('mima') to the default value 123456.
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success"}

        req_dict = session.get("req_dict")

        if req_dict.get('mima') is not None:
            req_dict['mima'] = '123456'

        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)

        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        else:
            msg['msg'] = '密码已重置为:123456'  # "password has been reset to: 123456"
        return jsonify(msg)

# Session info endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/session", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_session():
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "data": {}}
        req_dict={"id":session.get('params').get("id")}
        msg['data']  = wangluoxiaoshuo.getbyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)[0]

        return jsonify(msg)

# Paginated list endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/page", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_page():
    '''
    Return one page of records, limited to the current user's rows
    unless the login table is flagged as an admin table.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success",  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        tablename = session.get("tablename")

        # Models flagged with __hasMessage__ are filtered by the current user's id
        # ("否" means "no")
        __hasMessage__ = getattr(wangluoxiaoshuo, '__hasMessage__', None)
        if __hasMessage__ and __hasMessage__ != "否":
            if tablename != "users" and userinfo is not None:
                req_dict["userid"] = userinfo.get("id")

        if tablename != "users":
            # Map table names to model classes to look up the login table's admin flag.
            # _decl_class_registry is a private attribute of the declarative base
            # in older SQLAlchemy releases.
            mapping_str_to_object = {}
            for model in Base_model._decl_class_registry.values():
                if hasattr(model, '__tablename__'):
                    mapping_str_to_object[model.__tablename__] = model

            try:
                __isAdmin__ = mapping_str_to_object[tablename].__isAdmin__
            except (KeyError, AttributeError):
                __isAdmin__ = None

            # Non-admin users only see their own rows ("是" means "yes")
            if __isAdmin__ != "是" and userinfo is not None:
                req_dict["userid"] = userinfo.get("id")
            else:
                req_dict.pop("userid", None)

        clause_args = []
        or_clauses = or_(*clause_args)

        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict, or_clauses)

        return jsonify(msg)

# Sorting endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/autoSort", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_autosort():
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success",  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        req_dict['sort']='clicktime'
        req_dict['order']='desc'

        __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)

        # "是" ("yes") sorts by click count, "时长" ("duration") by browsing time,
        # anything else by the time of the last click
        if __browseClick__ =='是':
            req_dict['sort']='clicknum'
        elif __browseClick__ =='时长':
            req_dict['sort']='browseduration'
        else:
            req_dict['sort']='clicktime'
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)

        return jsonify(msg)

# Paginated list endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/list", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_list():
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success",  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        if req_dict.__contains__('vipread'):
            del req_dict['vipread']
            
        userinfo = session.get("params")

        try:
            __foreEndList__=wangluoxiaoshuo.__foreEndList__
        except:
            __foreEndList__=None

        if __foreEndList__ and __foreEndList__!="否":
            tablename=session.get("tablename")
            if tablename!="users" and session.get("params")!=None:
                req_dict['userid']=session.get("params").get("id")

        try:
            __foreEndListAuth__=wangluoxiaoshuo.__foreEndListAuth__
        except:
            __foreEndListAuth__=None

        if __foreEndListAuth__ and __foreEndListAuth__!="否":
            tablename=session.get("tablename")
            if tablename!="users" and session.get("params")!=None:
                req_dict['userid']=session.get("params").get("id")

        tablename=session.get("tablename")
        if tablename=="users" :
            try:
                del req_dict["userid"]
            except:
                pass
        else:
            mapping_str_to_object = {}
            for model in Base_model._decl_class_registry.values():
                if hasattr(model, '__tablename__'):
                    mapping_str_to_object[model.__tablename__] = model

            try:
                __isAdmin__=mapping_str_to_object[tablename].__isAdmin__
            except:
                __isAdmin__=None

            if __isAdmin__!="是" and session.get("params")!=None:
                req_dict["userid"]=session.get("params").get("id")

        if 'luntan' in 'wangluoxiaoshuo':
            if 'userid' in req_dict.keys():
                del req_dict["userid"]


        if 'discuss' in 'wangluoxiaoshuo':
            if 'userid' in req_dict.keys():
                del req_dict["userid"]

        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)

        return jsonify(msg)

# Save endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/save", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_save():
    '''
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        for key in req_dict:
            if req_dict[key] == '':
                req_dict[key] = None

        error= wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)

# Create endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/add", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_add():
    '''
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        try:
            __foreEndListAuth__=wangluoxiaoshuo.__foreEndListAuth__
        except:
            __foreEndListAuth__=None

        if __foreEndListAuth__ and __foreEndListAuth__!="否":
            tablename=session.get("tablename")
            if tablename!="users":
                req_dict['userid']=session.get("params").get("id")

        error= wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)

# Like / dislike endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/thumbsup/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_thumbsup(id_):
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        id_=int(id_)
        type_=int(req_dict.get("type",0))
        rets=wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo,id_)

        update_dict={
        "id":id_,
        }
        # A counter may be unset on a record, so treat None as 0
        if type_==1:  # like
            update_dict["thumbsupnum"]=int(rets[0].get('thumbsupnum') or 0)+1
        elif type_==2:  # dislike
            update_dict["crazilynum"]=int(rets[0].get('crazilynum') or 0)+1
        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, update_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)

# Detail endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/info/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_info(id_):
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}

        data = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        if len(data)>0:
            msg['data']=data[0]
        # Increment the view counter
        __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)

        # Update only when the record exists; data[0] would fail on an empty result
        if data and __browseClick__ and "clicknum" in wangluoxiaoshuo.__table__.columns:
            click_dict={"id":int(id_),"clicknum":str(int(data[0].get("clicknum") or 0)+1)}
            ret=wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo,wangluoxiaoshuo,click_dict)
            if ret is not None:
                msg['code'] = crud_error_code
                msg['msg'] = ret
        return jsonify(msg)

# Detail endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/detail/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_detail(id_):
    '''
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}

        data = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        if len(data)>0:
            msg['data']=data[0]

        # Increment the view counter
        __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)

        # Update only when the record exists; data[0] would fail on an empty result
        if data and __browseClick__ and "clicknum" in wangluoxiaoshuo.__table__.columns:
            click_dict={"id":int(id_),"clicknum":str(int(data[0].get("clicknum") or 0)+1)}
            ret=wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo,wangluoxiaoshuo,click_dict)
            if ret is not None:
                msg['code'] = crud_error_code
                msg['msg'] = ret
        return jsonify(msg)

# Update endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/update", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_update():
    '''
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        if req_dict.get("mima") and "mima" not in wangluoxiaoshuo.__table__.columns :
            del req_dict["mima"]
        if req_dict.get("password") and "password" not in wangluoxiaoshuo.__table__.columns :
            del req_dict["password"]
        try:
            del req_dict["clicknum"]
        except:
            pass


        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error


        return jsonify(msg)

# Delete endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/delete", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_delete():
    '''
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")

        error=wangluoxiaoshuo.delete(
            wangluoxiaoshuo,
            req_dict
        )
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)

# Vote endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/vote/<int:id_>", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_vote(id_):
    '''
    '''
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success"}


        data= wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        for i in data:
            votenum=i.get('votenum')
            if votenum!=None:
                params={"id":int(id_),"votenum":votenum+1}
                error=wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo,wangluoxiaoshuo,params)
                if error!=None:
                    msg['code'] = crud_error_code
                    msg['msg'] = error
        return jsonify(msg)




# Group-count statistics endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/group/<columnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_group(columnName):
    '''
    Count records grouped by the given column.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")

        # columnName comes from the URL and is concatenated into SQL below,
        # so accept only real columns of this table
        if columnName not in wangluoxiaoshuo.__table__.columns:
            msg['code'] = crud_error_code
            msg['msg'] = 'unknown column: ' + columnName
            return jsonify(msg)

        msg['data'] = wangluoxiaoshuo.groupbycolumnname(wangluoxiaoshuo,wangluoxiaoshuo,columnName,req_dict)
        msg['data'] = msg['data'][:10]  # keep at most the first 10 groups

        json_filename='wangluoxiaoshuo'+f'_group_{columnName}.json'

        where = ' where 1 = 1 '
        sql = "SELECT COUNT(*) AS total, " + columnName + " FROM wangluoxiaoshuo " + where + " GROUP BY " + columnName
        # Run the Spark read, the HDFS upload and the MapReduce job in the background
        app.executor.submit(spark_read_mysql, f"({sql}) "+'wangluoxiaoshuo', json_filename)
        with open(json_filename, 'w', encoding='utf-8') as f:
            f.write(json.dumps(msg['data'], indent=4, ensure_ascii=False))
        app.executor.submit(upload_to_hdfs, json_filename)
        app.executor.submit(MRMySQLAvg.run)
        return jsonify(msg)

# Value statistics endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/value/<xColumnName>/<yColumnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_value(xColumnName, yColumnName):
    '''
    Aggregate yColumnName per xColumnName value and return the first 10 rows.
    Example response:
    {
        "code": 0,
        "data": [
            {
                "total": 10.0,
                "shangpinleibie": "aa"
            },
            {
                "total": 20.0,
                "shangpinleibie": "bb"
            },
            {
                "total": 15.0,
                "shangpinleibie": "cc"
            }
        ]
    }
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")

        msg['data'] = wangluoxiaoshuo.getvaluebyxycolumnname(wangluoxiaoshuo, wangluoxiaoshuo, xColumnName, yColumnName, req_dict)
        msg['data'] = msg['data'][:10]
        return jsonify(msg)

# Date-bucketed statistics endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/value/<xColumnName>/<yColumnName>/<timeStatType>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_value_riqi(xColumnName, yColumnName, timeStatType):
    '''
    Sum yColumnName per day, month, or year of the date column xColumnName.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}

        # timeStatType (day / month / year) -> MySQL DATE_FORMAT pattern.
        date_formats = {'日': '%Y-%m-%d', '月': '%Y-%m', '年': '%Y'}
        columns = wangluoxiaoshuo.__table__.columns
        # Both column names and the bucket type end up in the SQL text,
        # so validate them before building the statement.
        if timeStatType not in date_formats or xColumnName not in columns or yColumnName not in columns:
            msg['code'] = crud_error_code
            msg['msg'] = "invalid statistics parameters"
            return jsonify(msg)

        where = ' where 1 = 1 '
        sql = "SELECT DATE_FORMAT({0}, '{3}') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '{3}')".format(xColumnName, yColumnName, where, date_formats[timeStatType])

        data = db.session.execute(sql).fetchall()
        results = [
            {xColumnName: decimalEncoder(row[0]), 'total': decimalEncoder(row[1])}
            for row in data
        ]

        msg['data'] = results
        json_filename = f'wangluoxiaoshuo_value_{xColumnName}_{yColumnName}.json'
        app.executor.submit(spark_read_mysql, f"({sql}) wangluoxiaoshuo", json_filename)
        with open(json_filename, 'w', encoding='utf-8') as f:
            f.write(json.dumps(results, indent=4, ensure_ascii=False))
        app.executor.submit(upload_to_hdfs, json_filename)
        app.executor.submit(MRMySQLAvg.run)

        return jsonify(msg)

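# Value statistics for multiple y columns
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/valueMul/<xColumnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_valueMul(xColumnName):
    '''
    For each column listed in yColumnNameMul, sum it grouped by xColumnName.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}

        req_dict = session.get("req_dict")
        where = ' where 1 = 1 '

        columns = wangluoxiaoshuo.__table__.columns
        y_columns = [name.strip() for name in req_dict['yColumnNameMul'].split(',')]
        # Column names are interpolated into SQL, so validate them first.
        if xColumnName not in columns or any(name not in columns for name in y_columns):
            msg['code'] = crud_error_code
            msg['msg'] = "invalid statistics parameters"
            return jsonify(msg)

        for item in y_columns:
            sql = "SELECT {0}, sum({1}) AS total FROM wangluoxiaoshuo {2} GROUP BY {0} LIMIT 10".format(xColumnName, item, where)
            data = db.session.execute(sql).fetchall()
            # One result series per y column.
            series = [
                {xColumnName: decimalEncoder(row[0]), 'total': decimalEncoder(row[1])}
                for row in data
            ]
            msg['data'].append(series)

        return jsonify(msg)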

# Date-bucketed statistics for multiple y columns
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/valueMul/<xColumnName>/<timeStatType>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_valueMul_time(xColumnName, timeStatType):
    '''
    For each column listed in yColumnNameMul, sum it per day, month, or year of xColumnName.
    '''
    # The route declares <timeStatType>, so the view must accept it as an argument;
    # without it Flask raises a TypeError when the URL is matched.
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}

        req_dict = session.get("req_dict")
        where = ' where 1 = 1 '

        # timeStatType (day / month / year) -> MySQL DATE_FORMAT pattern.
        date_formats = {'日': '%Y-%m-%d', '月': '%Y-%m', '年': '%Y'}
        columns = wangluoxiaoshuo.__table__.columns
        y_columns = [name.strip() for name in req_dict['yColumnNameMul'].split(',')]
        # Everything interpolated into the SQL text is validated first.
        if (timeStatType not in date_formats or xColumnName not in columns
                or any(name not in columns for name in y_columns)):
            msg['code'] = crud_error_code
            msg['msg'] = "invalid statistics parameters"
            return jsonify(msg)

        for item in y_columns:
            sql = "SELECT DATE_FORMAT({0}, '{3}') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '{3}') LIMIT 10".format(xColumnName, item, where, date_formats[timeStatType])
            data = db.session.execute(sql).fetchall()
            # One result series per y column.
            series = [
                {xColumnName: decimalEncoder(row[0]), 'total': decimalEncoder(row[1])}
                for row in data
            ]
            msg['data'].append(series)

        return jsonify(msg)


# Total count endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/count", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_count():
    '''
    Return the number of records matching the request parameters.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": 0}
        req_dict = session.get("req_dict")

        msg['data'] = wangluoxiaoshuo.count(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)

        return jsonify(msg)






# Reminder count endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/remind/<columnName>/<type>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_remind(columnName, type):
    '''
    Count records whose columnName value falls inside the reminder window.
    type 1: numeric range; type 2: date range given as day offsets from now.
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, 'count': 0}
        # Query parameters
        params = session.get("req_dict")
        start = params.get('remindstart')
        end = params.get('remindend')
        remindstart = 0
        remindend = 999999
        if int(type) == 1:  # numeric column
            if start is not None:
                remindstart = int(start)
            if end is not None:
                remindend = int(end)
        elif int(type) == 2:  # date column
            # Offsets are in days; a missing bound defaults to two years.
            # The window bounds are assigned to the variables that are actually
            # passed to getbetweenparams below (the original stored them in
            # params and never used them).
            day = 60 * 60 * 24
            current_time = int(time.time())
            start_days = int(start) if start is not None else 365 * 2
            end_days = int(end) if end is not None else 365 * 2
            remindstart = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(current_time - day * start_days))
            remindend = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(current_time + day * end_days))

        data = wangluoxiaoshuo.getbetweenparams(
            wangluoxiaoshuo,
            wangluoxiaoshuo,
            columnName,
            {
                "remindStart": remindstart,
                "remindEnd": remindend
            }
        )

        msg['count'] = len(data)
        return jsonify(msg)






# Category list
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/lists", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_lists():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}
        # page() returns a 5-tuple; only the record list is needed here.
        # The result is named rows to avoid shadowing the built-in list.
        rows, _, _, _, _ = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, {})
        msg['data'] = rows
        return jsonify(msg)
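
The statistics endpoints above build their SQL by formatting URL-supplied column names into the query string. As a rough sketch of how that step could be isolated and checked, the helper below validates its inputs against a whitelist before producing the same DATE_FORMAT ... GROUP BY statement. The function name `build_date_stat_sql`, the `DATE_FORMATS` table, and the column names in the usage example are assumptions made for illustration; they are not part of the original project.

```python
# Hypothetical helper, not part of the original project.
# Maps the timeStatType values used by the endpoints to MySQL DATE_FORMAT patterns.
DATE_FORMATS = {"日": "%Y-%m-%d", "月": "%Y-%m", "年": "%Y"}


def build_date_stat_sql(table, x_column, y_column, time_stat_type, allowed_columns):
    """Return a date-bucketed SUM query, or raise ValueError on bad input."""
    if time_stat_type not in DATE_FORMATS:
        raise ValueError(f"unsupported timeStatType: {time_stat_type!r}")
    # Identifiers cannot be bound as query parameters, so they are
    # checked against a whitelist instead.
    for name in (x_column, y_column):
        if name not in allowed_columns:
            raise ValueError(f"unknown column: {name!r}")
    fmt = DATE_FORMATS[time_stat_type]
    bucket = f"DATE_FORMAT({x_column}, '{fmt}')"
    return (
        f"SELECT {bucket} {x_column}, sum({y_column}) total "
        f"FROM {table} WHERE 1 = 1 GROUP BY {bucket}"
    )


if __name__ == "__main__":
    # Assumed column names, for illustration only.
    allowed = {"addtime", "clicknum", "votenum"}
    print(build_date_stat_sql("wangluoxiaoshuo", "addtime", "clicknum", "月", allowed))
```

In the Flask views, the whitelist could plausibly come from `wangluoxiaoshuo.__table__.columns`, which the update endpoint already consults when it strips password fields.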



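Project source code, custom development, documentation and reports, slide decks, and code Q&A are available on request.
I look forward to exchanging ideas with everyone!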

From: https://blog.csdn.net/IT_YQG_/article/details/141143114
