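Note: this post shows only part of the project's functionality. For more details, use the contact information at the end of the post.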
1 Development environment
Language: Python
Frameworks: Flask, Scrapy crawler
Database: MySQL
IDE: PyCharm
2 System design
2.1 Design background
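Advances in the internet and digital technology have changed how people live and have reshaped how literature is written and distributed. Web novels, an emerging literary form with a distinctive distribution model and a broad readership, now hold a significant place in the literary field. Collecting and analyzing web-novel data effectively can provide technical support for the healthy development of online literature and for a richer cultural life. This project therefore uses Python, together with Scrapy crawling, data processing, and the Flask framework, to build a complete system for crawling, processing, storing, and visually analyzing web-novel data. The system helps improve reading recommendations for web novels and offers literary research a new perspective and method, so it has both theoretical and practical value.
In the digital era, online literature has grown quickly on the internet and become a distinct cultural phenomenon. Web novels, an important branch of online literature, attract large numbers of readers with their varied subject matter and convenient reading. As the number of web novels surges, helping readers filter out high-quality works and find the content they care about efficiently has become a pressing problem. Designing and building a Python-based system that crawls web-novel ranking data and visualizes it therefore has practical value for making better use of online-literature resources and supporting their healthy development.
With the arrival of the big-data era, novel reading has shifted from paper to electronic formats, and this change in user habits has driven explosive growth in the number of web novels. By the early 2020s, China's web-novel market had reached 24.98 billion yuan, with 460 million users (46.5% of China's internet users) and roughly 29.059 million works written. These figures also reflect the rapid growth in the number of web-novel sites in recent years. Well-known ones include Qidian (起点小说网), Biquge (笔趣阁小说网), Faloo (飞卢中文网), and Shuqi (书旗小说网), and the industry remains very popular. Chinese web novels are also popular overseas, including in the United States, Southeast Asia, and Russia.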
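2.2 Design scope
Based on an analysis of the platform and its functional requirements, the system is divided into several functions: data collection, data cleaning and processing, data visualization, user-side page design, and admin-side page design. Data collection gathers novel ranking data from Qidian (起点中文网) and deals with the site's anti-crawling measures, using techniques such as suitable request headers and a proxy pool. Data processing covers handling duplicate values, null values, and outliers. Data visualization is presented as a large-screen dashboard on the admin side.
The user side provides login and registration and displays the crawled novel data on the web-novel page, where it can be searched. The news page shows the news items published from the admin side, and every user can post comments on the message board. The structure of the user function modules is shown in Figure 3.
The admin side can display and manage user information, trigger crawling of the novel rankings, and manage the messages users submit. It also provides system-wide management options, including publishing system announcements.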
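The request-header and proxy-pool handling mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `USER_AGENTS` list, the `PROXIES` addresses, and the `build_request_options` helper are hypothetical names chosen for this example, and the proxy addresses are placeholders from an IP range reserved for documentation.

```python
import itertools
import random

# Hypothetical pool of User-Agent strings; a real crawler would keep a longer list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]

# Placeholder proxy addresses (203.0.113.0/24 is reserved for documentation).
PROXIES = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]

# Round-robin iterator so consecutive requests use different proxies.
_proxy_cycle = itertools.cycle(PROXIES)


def build_request_options(referer="https://www.qidian.com/"):
    """Return the headers and proxy to use for the next request."""
    headers = {
        "User-Agent": random.choice(USER_AGENTS),  # vary the browser signature
        "Referer": referer,
        "Accept-Language": "zh-CN,zh;q=0.9",
    }
    return {"headers": headers, "proxy": next(_proxy_cycle)}
```

In a Scrapy spider, these values could be passed per request as `scrapy.Request(url, headers=options["headers"], meta={"proxy": options["proxy"]})`, since Scrapy's built-in proxy middleware reads the `proxy` key from the request's `meta`.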
3 System pages
3.1 User pages
3.2 Administrator pages
3.3 Feature demo video
<iframe allowfullscreen="true" data-mediaembed="csdn" frameborder="0" id="FKVQQ6Kb-1723471650219" src="https://live.csdn.net/v/embed/416932"></iframe>Design and implementation of a Python-crawler-based web-novel data analysis system
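4 More recommendations
A curated collection of computer-science graduation project topics
Hadoop-based analysis and visualization of e-commerce user behavior
Crawling and analysis of data-analyst job postings with Django and Python
A WeChat mini-program for railway ticket booking
Python-crawler-based price comparison and data analysis for online-store products
5 Selected code
5.1 Crawler code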
# -*- coding: utf-8 -*-
# Spider module: crawls the web-novel data
import scrapy
import pymysql
import pymssql
from ..items import WangluoxiaoshuoItem
import time
from datetime import datetime,timedelta
import datetime as formattime
import re
import random
import platform
import json
import os
import urllib
from urllib.parse import urlparse
import requests
import emoji
import numpy as np
import pandas as pd
from sqlalchemy import create_engine
from selenium.webdriver import ChromeOptions, ActionChains
from scrapy.http import TextResponse
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
# Web-novel spider
class WangluoxiaoshuoSpider(scrapy.Spider):
    name = 'wangluoxiaoshuoSpider'
    # '{}' is replaced with the page number in start_requests()
    spiderUrl = 'https://www.qidian.com/finish/chanId21-page{}/'
    start_urls = spiderUrl.split(";")
    protocol = ''
    hostname = ''
    realtime = False

    def __init__(self, realtime=False, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Spider arguments arrive as strings, so compare against 'true'
        self.realtime = realtime == 'true'

    def start_requests(self):
        plat = platform.system().lower()
        # Outside real-time mode, reuse the cached table if it already exists
        if not self.realtime and (plat == 'linux' or plat == 'windows'):
            connect = self.db_connect()
            cursor = connect.cursor()
            if self.table_exists(cursor, 'di2zvh33_wangluoxiaoshuo') == 1:
                cursor.close()
                connect.close()
                self.temp_data()
                return
        # range(1, pageNum) below covers page 1 only
        pageNum = 1 + 1
        for url in self.start_urls:
            if '{}' in url:
                for page in range(1, pageNum):
                    next_link = url.format(page)
                    yield scrapy.Request(
                        url=next_link,
                        callback=self.parse
                    )
            else:
                yield scrapy.Request(
                    url=url,
                    callback=self.parse
                )

    # Parse the list page
    def parse(self, response):
        _url = urlparse(self.spiderUrl)
        self.protocol = _url.scheme
        self.hostname = _url.netloc
        plat = platform.system().lower()
        if not self.realtime and (plat == 'linux' or plat == 'windows'):
            connect = self.db_connect()
            cursor = connect.cursor()
            if self.table_exists(cursor, 'di2zvh33_wangluoxiaoshuo') == 1:
                cursor.close()
                connect.close()
                self.temp_data()
                return
        # (item field, CSS selector, prefix prepended to the extracted value)
        list_rules = [
            ("name", 'div.book-mid-info h2 a::text', ''),
            ("picture", 'div.book-img-box a img::attr(src)', 'https:'),
            ("fenlei", 'a.go-sub-type::text', ''),
            ("miaoshu", 'p.intro::text', ''),
            ("xqdz", 'div.book-img-box a::attr(href)', 'https:'),
        ]
        # Each <li> in the list is one novel
        for item in response.css('ul[class="all-img-list cf"] li'):
            fields = WangluoxiaoshuoItem()
            for key, selector, prefix in list_rules:
                try:
                    fields[key] = prefix + self.remove_html(item.css(selector).extract_first())
                except Exception:
                    # A field that cannot be extracted is simply left unset
                    pass
            detailUrlRule = item.css('div.book-img-box a::attr(href)').extract_first()
            if not detailUrlRule:
                # No detail link, so there is nothing to follow for this entry
                continue
            # Normalize the detail link to an absolute URL
            if self.protocol in detailUrlRule or detailUrlRule.startswith('http'):
                pass
            elif detailUrlRule.startswith('//'):
                detailUrlRule = self.protocol + ':' + detailUrlRule
            elif detailUrlRule.startswith('/'):
                detailUrlRule = self.protocol + '://' + self.hostname + detailUrlRule
                fields["laiyuan"] = detailUrlRule
            else:
                detailUrlRule = self.protocol + '://' + self.hostname + '/' + detailUrlRule
            # The URL is already absolute here, so no extra 'https:' prefix is added
            yield scrapy.Request(url=detailUrlRule, meta={'fields': fields}, callback=self.detail_parse, dont_filter=True)

    # Parse the detail page
    def detail_parse(self, response):
        fields = response.meta['fields']
        # Text fields: item field -> CSS selector
        text_rules = {
            "author": 'span.author::text',
            "zishu": 'p.count em::text',
            "zongtuijian": 'p.count em:nth-child(3)::text',
            "writenum": 'div.write em.color-font-card::text',
        }
        # Integer fields: a value that cannot be parsed leaves the field unset
        int_rules = {
            "zhoutuijian": 'p.count em:nth-child(5)::text',
            "worknum": 'div.work-number em.color-font-card::text',
            "days": 'div.days em.color-font-card::text',
        }
        for key, selector in text_rules.items():
            try:
                value = self.remove_html(response.css(selector).extract_first())
                # Strip the '作者:' ("Author:") label from the author field
                fields[key] = value.replace('作者:', '') if key == "author" else value
            except Exception:
                pass
        for key, selector in int_rules.items():
            try:
                fields[key] = int(self.remove_html(response.css(selector).extract_first()))
            except Exception:
                pass
        return fields

    # Data cleaning
    def pandas_filter(self):
        engine = create_engine('mysql+pymysql://root:123456@localhost/spiderdi2zvh33?charset=UTF8MB4')
        df = pd.read_sql('select * from wangluoxiaoshuo limit 50', con=engine)
        # Drop duplicate rows (pandas returns a new frame, so keep the result)
        df = df.drop_duplicates()
        # Drop rows that are entirely empty
        df = df.dropna(how='all')
        # Fill the remaining missing values with '暂无' ("none yet")
        df = df.fillna(value='暂无')
        # Outlier filtering, demonstrated on random data:
        # keep only the values between 100 and 800
        a = np.random.randint(0, 1000, size=200)
        a = a[(a <= 800) & (a >= 100)]
        # 3σ rule on standard-normal data (σ = 1): keep |x| <= 3σ
        b = np.random.randn(100000)
        b = b[np.abs(b) <= 3 * 1]
        # 3σ rule applied per column of a DataFrame
        df2 = pd.DataFrame(data=np.random.randn(10000, 3))
        cond = (df2.abs() > 3 * df2.std()).any(axis=1)
        # Row labels of the outliers
        index = df2[cond].index
        # Drop those rows
        df2 = df2.drop(labels=index, axis=0)
        return df

    # Strip leftover HTML tags
    def remove_html(self, html):
        if html is None:
            return ''
        pattern = re.compile(r'<[^>]+>', re.S)
        return pattern.sub('', html).strip()

    # Open a database connection
    def db_connect(self):
        db_type = self.settings.get('TYPE', 'mysql')
        host = self.settings.get('HOST', 'localhost')
        port = int(self.settings.get('PORT', 3306))
        user = self.settings.get('USER', 'root')
        password = self.settings.get('PASSWORD', '123456')
        try:
            database = self.databaseName
        except AttributeError:
            database = self.settings.get('DATABASE', '')
        if db_type == 'mysql':
            connect = pymysql.connect(host=host, port=port, database=database, user=user, password=password, charset='utf8')
        else:
            connect = pymssql.connect(host=host, user=user, password=password, database=database)
        return connect

    # Check whether a table exists (returns 1 or 0)
    def table_exists(self, cursor, table_name):
        cursor.execute("show tables;")
        # Each row holds one table name in its first column
        table_list = [row[0] for row in cursor.fetchall()]
        return 1 if table_name in table_list else 0

    # Load data from the cached source table
    def temp_data(self):
        connect = self.db_connect()
        cursor = connect.cursor()
        # Copy up to 50 random cached rows that are not yet in `wangluoxiaoshuo`
        sql = '''
            insert into `wangluoxiaoshuo`(
                id, name, picture, author, fenlei, miaoshu, zishu,
                zongtuijian, zhoutuijian, worknum, writenum, days, xqdz
            )
            select
                id, name, picture, author, fenlei, miaoshu, zishu,
                zongtuijian, zhoutuijian, worknum, writenum, days, xqdz
            from `di2zvh33_wangluoxiaoshuo`
            where (not exists (
                select id from `wangluoxiaoshuo`
                where `wangluoxiaoshuo`.id = `di2zvh33_wangluoxiaoshuo`.id
            ))
            order by rand()
            limit 50;
        '''
        cursor.execute(sql)
        connect.commit()
        connect.close()
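The cleaning steps in `pandas_filter` (duplicates, missing values, 3σ outliers) could be applied to the crawled records along these lines. This is a sketch under assumptions rather than the project's code: the `clean_novels` function is hypothetical, and filtering on the `zhoutuijian` column and filling gaps with '暂无' are choices made for this example that mirror the names used in the listing above.

```python
import pandas as pd


def clean_novels(df, numeric_col="zhoutuijian", fill_value="暂无"):
    """Return a cleaned copy: de-duplicated, gaps filled, 3-sigma outliers removed."""
    # 1. Remove exact duplicate rows.
    df = df.drop_duplicates().copy()

    # 2. Rows without a usable numeric value cannot be compared, so drop them.
    df[numeric_col] = pd.to_numeric(df[numeric_col], errors="coerce")
    df = df.dropna(subset=[numeric_col])

    # 3. Fill the remaining gaps in the other columns with a placeholder.
    other_cols = df.columns.drop(numeric_col)
    df[other_cols] = df[other_cols].fillna(fill_value)

    # 4. 3-sigma rule: keep rows within three standard deviations of the mean.
    mean, std = df[numeric_col].mean(), df[numeric_col].std()
    if pd.notna(std) and std > 0:
        df = df[(df[numeric_col] - mean).abs() <= 3 * std]

    return df.reset_index(drop=True)
```

Unlike calling `df.drop_duplicates()` or `df.dropna()` without using the result, each step here reassigns `df`, so the returned frame actually reflects the cleaning. Note that the 3σ rule only flags anything once there are enough rows; with a handful of records a single extreme value cannot exceed three sample standard deviations.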
5.2 Novel module code
# coding:utf-8
__author__ = "ila"
import logging, os, json, configparser
import time
from datetime import datetime
from flask import request, jsonify,session
from sqlalchemy.sql import func,and_,or_,case
from sqlalchemy import cast, Integer,Float
from api.models.brush_model import *
from . import main_bp
from utils.codes import *
from utils.jwt_auth import Auth
from configs import configs
from utils.helper import *
import random
import smtplib
from email.mime.text import MIMEText
from email.utils import formataddr
from email.header import Header
from utils.baidubce_api import BaiDuBce
from api.models.config_model import config
from flask import current_app as app
from utils.spark_func import spark_read_mysql
from utils.hdfs_func import upload_to_hdfs
from utils.mapreduce1 import MRMySQLAvg
# Registration endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/register", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_register():
    if request.method == 'POST':
        msg = {'code': normal_code, 'msg': 'success', 'data': [{}]}
        req_dict = session.get("req_dict")
        error = wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = "注册用户已存在"  # "this user is already registered"
        return jsonify(msg)
# Login endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/login", methods=['GET','POST'])
def python4svlqv70_wangluoxiaoshuo_login():
    if request.method == 'GET' or request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        # Note: req_model refers to the same dict object as req_dict
        req_model = session.get("req_dict")
        req_model.pop('role', None)
        datas = wangluoxiaoshuo.getbyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_model)
        if not datas:
            msg['code'] = password_error_code
            msg['msg'] = '密码错误或用户不存在'  # "wrong password or unknown user"
            return jsonify(msg)
        req_dict['id'] = datas[0].get('id')
        # Do not pass the password on to the token step
        req_dict.pop('mima', None)
        return Auth.authenticate(Auth, wangluoxiaoshuo, req_dict)
# Logout endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/logout", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_logout():
    if request.method == 'POST':
        msg = {
            "msg": "退出成功",  # "logged out"
            "code": 0
        }
        return jsonify(msg)
# Password reset endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/resetPass", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_resetpass():
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success"}
        req_dict = session.get("req_dict")
        # The password is always reset to the fixed default
        if req_dict.get('mima') is not None:
            req_dict['mima'] = '123456'
        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        else:
            msg['msg'] = '密码已重置为:123456'  # "password has been reset to 123456"
        return jsonify(msg)
# Session info endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/session", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_session():
    if request.method == 'GET':
        msg = {"code": normal_code, "data": {}}
        req_dict = {"id": session.get('params').get("id")}
        msg['data'] = wangluoxiaoshuo.getbyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)[0]
        return jsonify(msg)
# Pagination endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/page", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_page():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        __hasMessage__ = getattr(wangluoxiaoshuo, '__hasMessage__', None)
        if __hasMessage__ and __hasMessage__ != "否":
            tablename = session.get("tablename")
            if tablename != "users" and session.get("params") is not None and wangluoxiaoshuo != 'chat':
                req_dict["userid"] = session.get("params").get("id")
        tablename = session.get("tablename")
        if tablename != "users":
            # Map table names to model classes to look up the caller's role
            mapping_str_to_object = {}
            for model in Base_model._decl_class_registry.values():
                if hasattr(model, '__tablename__'):
                    mapping_str_to_object[model.__tablename__] = model
            try:
                __isAdmin__ = mapping_str_to_object[tablename].__isAdmin__
            except (KeyError, AttributeError):
                __isAdmin__ = None
            if __isAdmin__ != "是" and session.get("params") is not None:
                # Non-admin users only see their own records
                req_dict["userid"] = session.get("params").get("id")
            else:
                req_dict.pop("userid", None)
        clause_args = []
        or_clauses = or_(*clause_args)
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize'] = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict, or_clauses)
        return jsonify(msg)
# Auto-sort endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/autoSort", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_autosort():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        req_dict['order'] = 'desc'
        # Sort by click count, browse duration, or last click time,
        # depending on how the model is configured
        __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)
        if __browseClick__ == '是':
            req_dict['sort'] = 'clicknum'
        elif __browseClick__ == '时长':
            req_dict['sort'] = 'browseduration'
        else:
            req_dict['sort'] = 'clicktime'
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize'] = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        return jsonify(msg)
# Pagination endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/list", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_list():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = session.get("req_dict")
        if 'vipread' in req_dict:
            del req_dict['vipread']
        userinfo = session.get("params")
        __foreEndList__ = getattr(wangluoxiaoshuo, '__foreEndList__', None)
        if __foreEndList__ and __foreEndList__ != "否":
            tablename = session.get("tablename")
            if tablename != "users" and session.get("params") is not None:
                req_dict['userid'] = session.get("params").get("id")
        __foreEndListAuth__ = getattr(wangluoxiaoshuo, '__foreEndListAuth__', None)
        if __foreEndListAuth__ and __foreEndListAuth__ != "否":
            tablename = session.get("tablename")
            if tablename != "users" and session.get("params") is not None:
                req_dict['userid'] = session.get("params").get("id")
        tablename = session.get("tablename")
        if tablename == "users":
            req_dict.pop("userid", None)
        else:
            # Map table names to model classes to look up the caller's role
            mapping_str_to_object = {}
            for model in Base_model._decl_class_registry.values():
                if hasattr(model, '__tablename__'):
                    mapping_str_to_object[model.__tablename__] = model
            try:
                __isAdmin__ = mapping_str_to_object[tablename].__isAdmin__
            except (KeyError, AttributeError):
                __isAdmin__ = None
            if __isAdmin__ != "是" and session.get("params") is not None:
                req_dict["userid"] = session.get("params").get("id")
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize'] = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        return jsonify(msg)
# Save endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/save", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_save():
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        # Store empty strings as NULL
        for key in req_dict:
            if req_dict[key] == '':
                req_dict[key] = None
        error = wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)
# Add endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/add", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_add():
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        __foreEndListAuth__ = getattr(wangluoxiaoshuo, '__foreEndListAuth__', None)
        if __foreEndListAuth__ and __foreEndListAuth__ != "否":
            tablename = session.get("tablename")
            if tablename != "users":
                req_dict['userid'] = session.get("params").get("id")
        error = wangluoxiaoshuo.createbyreq(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)
# Like / dislike endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/thumbsup/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_thumbsup(id_):
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        id_ = int(id_)
        type_ = int(req_dict.get("type", 0))
        rets = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, id_)
        update_dict = {
            "id": id_,
        }
        if type_ == 1:  # like
            update_dict["thumbsupnum"] = int(rets[0].get('thumbsupnum')) + 1
        elif type_ == 2:  # dislike
            update_dict["crazilynum"] = int(rets[0].get('crazilynum')) + 1
        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, update_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)
# Detail endpoint (back end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/info/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_info(id_):
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        data = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        if len(data) > 0:
            msg['data'] = data[0]
            # Increment the click counter if the model tracks it
            __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)
            if __browseClick__ and "clicknum" in wangluoxiaoshuo.__table__.columns:
                click_dict = {"id": int(id_), "clicknum": str(int(data[0].get("clicknum") or 0) + 1)}
                ret = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, click_dict)
                if ret is not None:
                    msg['code'] = crud_error_code
                    msg['msg'] = ret
        return jsonify(msg)
# Detail endpoint (front end)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/detail/<id_>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_detail(id_):
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        data = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        if len(data) > 0:
            msg['data'] = data[0]
            # Increment the click counter if the model tracks it
            __browseClick__ = getattr(wangluoxiaoshuo, '__browseClick__', None)
            if __browseClick__ and "clicknum" in wangluoxiaoshuo.__table__.columns:
                click_dict = {"id": int(id_), "clicknum": str(int(data[0].get("clicknum") or 0) + 1)}
                ret = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, click_dict)
                if ret is not None:
                    msg['code'] = crud_error_code
                    msg['msg'] = ret
        return jsonify(msg)
# Update endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/update", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_update():
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        # Drop password fields that this table does not have
        if req_dict.get("mima") and "mima" not in wangluoxiaoshuo.__table__.columns:
            del req_dict["mima"]
        if req_dict.get("password") and "password" not in wangluoxiaoshuo.__table__.columns:
            del req_dict["password"]
        # The click counter is never updated through this endpoint
        req_dict.pop("clicknum", None)
        error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)
# Delete endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/delete", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_delete():
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        error = wangluoxiaoshuo.delete(
            wangluoxiaoshuo,
            req_dict
        )
        if error is not None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return jsonify(msg)
# Vote endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/vote/<int:id_>", methods=['POST'])
def python4svlqv70_wangluoxiaoshuo_vote(id_):
    if request.method == 'POST':
        msg = {"code": normal_code, "msg": "success"}
        data = wangluoxiaoshuo.getbyid(wangluoxiaoshuo, wangluoxiaoshuo, int(id_))
        for i in data:
            votenum = i.get('votenum')
            if votenum is not None:
                params = {"id": int(id_), "votenum": votenum + 1}
                error = wangluoxiaoshuo.updatebyparams(wangluoxiaoshuo, wangluoxiaoshuo, params)
                if error is not None:
                    msg['code'] = crud_error_code
                    msg['msg'] = error
        return jsonify(msg)
# Group-by statistics endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/group/<columnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_group(columnName):
    '''
    Group-by statistics endpoint
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        msg['data'] = wangluoxiaoshuo.groupbycolumnname(wangluoxiaoshuo, wangluoxiaoshuo, columnName, req_dict)
        # Return at most the first 10 groups
        msg['data'] = msg['data'][:10]
        json_filename = 'wangluoxiaoshuo' + f'_group_{columnName}.json'
        where = ' where 1 = 1 '
        sql = "SELECT COUNT(*) AS total, " + columnName + " FROM wangluoxiaoshuo " + where + " GROUP BY " + columnName
        # Run the Spark, HDFS, and MapReduce jobs in the background
        app.executor.submit(spark_read_mysql, f"({sql}) " + 'wangluoxiaoshuo', json_filename)
        with open(json_filename, 'w', encoding='utf-8') as f:
            f.write(json.dumps(msg['data'], indent=4, ensure_ascii=False))
        app.executor.submit(upload_to_hdfs, json_filename)
        app.executor.submit(MRMySQLAvg.run)
        return jsonify(msg)
# Statistics-by-value endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/value/<xColumnName>/<yColumnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_value(xColumnName, yColumnName):
    '''
    Statistics-by-value endpoint. Example response:
    {
        "code": 0,
        "data": [
            {
                "total": 10.0,
                "shangpinleibie": "aa"
            },
            {
                "total": 20.0,
                "shangpinleibie": "bb"
            },
            {
                "total": 15.0,
                "shangpinleibie": "cc"
            }
        ]
    }
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        msg['data'] = wangluoxiaoshuo.getvaluebyxycolumnname(wangluoxiaoshuo, wangluoxiaoshuo, xColumnName, yColumnName, req_dict)
        # Return at most the first 10 entries
        msg['data'] = msg['data'][:10]
        return jsonify(msg)
# Statistics-by-date endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/value/<xColumnName>/<yColumnName>/<timeStatType>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_value_riqi(xColumnName, yColumnName, timeStatType):
    '''
    Statistics-by-date endpoint
    '''
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": {}}
        userinfo = session.get("params")
        where = ' where 1 = 1 '
        sql = ''
        # timeStatType selects the bucket: '日' (day), '月' (month), or '年' (year)
        if timeStatType == '日':
            sql = "SELECT DATE_FORMAT({0}, '%Y-%m-%d') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y-%m-%d')".format(xColumnName, yColumnName, where)
        if timeStatType == '月':
            sql = "SELECT DATE_FORMAT({0}, '%Y-%m') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y-%m')".format(xColumnName, yColumnName, where)
        if timeStatType == '年':
            sql = "SELECT DATE_FORMAT({0}, '%Y') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y')".format(xColumnName, yColumnName, where)
        data = db.session.execute(sql)
        data = data.fetchall()
        results = []
        for i in range(len(data)):
            result = {
                xColumnName: decimalEncoder(data[i][0]),
                'total': decimalEncoder(data[i][1])
            }
            results.append(result)
        msg['data'] = results
        json_filename = 'wangluoxiaoshuo' + f'_value_{xColumnName}_{yColumnName}.json'
        # Run the Spark, HDFS, and MapReduce jobs in the background
        app.executor.submit(spark_read_mysql, f"({sql}) " + 'wangluoxiaoshuo', json_filename)
        with open(json_filename, 'w', encoding='utf-8') as f:
            f.write(json.dumps(results, indent=4, ensure_ascii=False))
        app.executor.submit(upload_to_hdfs, json_filename)
        app.executor.submit(MRMySQLAvg.run)
        return jsonify(msg)
# Statistics by value (multiple series)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/valueMul/<xColumnName>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_valueMul(xColumnName):
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        where = ' where 1 = 1 '
        # One result series per comma-separated y column
        for item in req_dict['yColumnNameMul'].split(','):
            sql = "SELECT {0}, sum({1}) AS total FROM wangluoxiaoshuo {2} GROUP BY {0} LIMIT 10".format(xColumnName, item, where)
            L = []
            data = db.session.execute(sql)
            data = data.fetchall()
            for i in range(len(data)):
                result = {
                    xColumnName: decimalEncoder(data[i][0]),
                    'total': decimalEncoder(data[i][1])
                }
                L.append(result)
            msg['data'].append(L)
        return jsonify(msg)
# Statistics by date (multiple series)
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/valueMul/<xColumnName>/<timeStatType>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_valueMul_time(xColumnName, timeStatType):
    # Both URL variables must be accepted as arguments; timeStatType comes from the path
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        where = ' where 1 = 1 '
        # One result series per comma-separated y column
        for item in req_dict['yColumnNameMul'].split(','):
            sql = ''
            # timeStatType selects the bucket: '日' (day), '月' (month), or '年' (year)
            if timeStatType == '日':
                sql = "SELECT DATE_FORMAT({0}, '%Y-%m-%d') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y-%m-%d') LIMIT 10".format(xColumnName, item, where)
            if timeStatType == '月':
                sql = "SELECT DATE_FORMAT({0}, '%Y-%m') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y-%m') LIMIT 10".format(xColumnName, item, where)
            if timeStatType == '年':
                sql = "SELECT DATE_FORMAT({0}, '%Y') {0}, sum({1}) total FROM wangluoxiaoshuo {2} GROUP BY DATE_FORMAT({0}, '%Y') LIMIT 10".format(xColumnName, item, where)
            L = []
            data = db.session.execute(sql)
            data = data.fetchall()
            for i in range(len(data)):
                result = {
                    xColumnName: decimalEncoder(data[i][0]),
                    'total': decimalEncoder(data[i][1])
                }
                L.append(result)
            msg['data'].append(L)
        return jsonify(msg)
# Total count endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/count", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_count():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": 0}
        req_dict = session.get("req_dict")
        userinfo = session.get("params")
        msg['data'] = wangluoxiaoshuo.count(wangluoxiaoshuo, wangluoxiaoshuo, req_dict)
        return jsonify(msg)
# Reminder statistics endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/remind/<columnName>/<type>", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_remind(columnName, type):
    if request.method == 'GET':
        msg = {"code": normal_code, 'count': 0}
        # Query parameters
        params = session.get("req_dict")
        start = params.get('remindstart')
        end = params.get('remindend')
        remindstart = 0
        remindend = 999999
        if int(type) == 1:  # numeric column: a missing bound keeps its default
            if start is not None:
                remindstart = int(start)
            if end is not None:
                remindend = int(end)
        elif int(type) == 2:  # date column: bounds are day offsets from now
            current_time = int(time.time())
            one_day = 60 * 60 * 24
            two_years = one_day * 365 * 2
            # A missing bound defaults to two years before or after the current time
            starttime = current_time - (one_day * int(start) if start is not None else two_years)
            endtime = current_time + (one_day * int(end) if end is not None else two_years)
            remindstart = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(starttime))
            remindend = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(endtime))
            params['remindstart'] = remindstart
            params['remindend'] = remindend
        # Pass the computed range on, so the date branch is actually applied
        data = wangluoxiaoshuo.getbetweenparams(
            wangluoxiaoshuo,
            wangluoxiaoshuo,
            columnName,
            {
                "remindStart": remindstart,
                "remindEnd": remindend
            }
        )
        msg['count'] = len(data)
        return jsonify(msg)
# Category list endpoint
@main_bp.route("/python4svlqv70/wangluoxiaoshuo/lists", methods=['GET'])
def python4svlqv70_wangluoxiaoshuo_lists():
    if request.method == 'GET':
        msg = {"code": normal_code, "msg": "success", "data": []}
        # page() returns (rows, currPage, totalPage, total, pageSize); only the rows are needed
        rows, _, _, _, _ = wangluoxiaoshuo.page(wangluoxiaoshuo, wangluoxiaoshuo, {})
        msg['data'] = rows
        return jsonify(msg)
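The statistics endpoints above build SQL by inserting URL path segments such as `xColumnName` and `yColumnName` directly into the query string. One common way to make that safer is to check each name against a whitelist of known columns before formatting the query. The sketch below is illustrative only: `ALLOWED_COLUMNS`, `safe_column`, and `build_group_sum_sql` are hypothetical names, and the column set is assumed from the `insert` statement in the crawler code.

```python
# Columns assumed from the crawler's insert statement; adjust to the real table.
ALLOWED_COLUMNS = frozenset({
    "id", "name", "picture", "author", "fenlei", "miaoshu", "zishu",
    "zongtuijian", "zhoutuijian", "worknum", "writenum", "days", "xqdz",
})


def safe_column(name):
    """Return the back-quoted column name, or raise if it is not whitelisted."""
    if name not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {name!r}")
    return f"`{name}`"


def build_group_sum_sql(x_column, y_column, limit=10):
    """Build the grouped SUM query using only validated identifiers."""
    x = safe_column(x_column)
    y = safe_column(y_column)
    return (
        f"SELECT {x}, SUM({y}) AS total FROM wangluoxiaoshuo "
        f"GROUP BY {x} LIMIT {int(limit)}"
    )
```

An endpoint could call `build_group_sum_sql(xColumnName, item)` and return an error response when `ValueError` is raised, so arbitrary text from the URL never reaches the database. Identifiers cannot be bound as query parameters, which is why a whitelist is used here instead of placeholders.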
Tags: __, Python, req, crawling, dict, web novels, msg, wangluoxiaoshuo, data. From: https://blog.csdn.net/IT_YQG_/article/details/141143114