A simple Python grid-algorithm demo for computing data density

Date: 2023-02-19 16:44:30 | Views: 49

Tags: index, gridSize, point, python, demo, grid, bounds, time, data

# Grid algorithm: compute how densely the data is packed in each region of a dataset
import time
import random
import numpy as np
import pandas as pd

# Simulate a dataset of 300,000 random integer points
def create_data():
    data_x = []
    data_y = []
    data = []
    for i in range(300000):
        x = random.randrange(0, 300000)
        y = random.randrange(-1500, 1500)
        data_x.append(x)
        data_y.append(y)
        data.append([x, y])
    return data_x, data_y, data

# Compute the data density of every grid cell
def calculate_density(gridSize, bounds):
    data_x, data_y, data = create_data()

    # Compute the cell edges along each axis
    x = np.arange(bounds[0][0],
                  bounds[1][0] + gridSize,
                  gridSize)
    y = np.arange(bounds[0][1],
                  bounds[1][1] + gridSize,
                  gridSize)

    # Build the grid with pandas; each cell is labelled by its lower edge
    grid = pd.DataFrame(0, index=x[:-1], columns=y[:-1])

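    # Assign each point to a grid cell
    for point in data:
        # Skip points outside the bounds. The upper edges are exclusive so
        # that the cell index computed below always stays inside the grid.
        if point[0] < bounds[0][0] \
            or point[0] >= bounds[1][0] \
            or point[1] < bounds[0][1] \
            or point[1] >= bounds[1][1]:
            continue

        # Work out which cell the point falls in
        x_index = int((point[0] - bounds[0][0]) // gridSize)
        y_index = int((point[1] - bounds[0][1]) // gridSize)

        # Increment that cell's count
        grid.iloc[x_index, y_index] += 1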

    # Compute the density of each cell: point count divided by cell area
    densities = grid.to_numpy() / (gridSize * gridSize)

    # Append the density of its cell to every point
    for point in data:
        # Same bounds check as above (upper edges exclusive)
        if point[0] < bounds[0][0] \
            or point[0] >= bounds[1][0] \
            or point[1] < bounds[0][1] \
            or point[1] >= bounds[1][1]:
            continue

        # Work out which cell the point falls in
        x_index = int((point[0] - bounds[0][0]) // gridSize)
        y_index = int((point[1] - bounds[0][1]) // gridSize)

        point.append(densities[x_index, y_index])

    return densities, data

if __name__ == "__main__":
    start_time = time.time()
    densities, data = calculate_density(100,
                                  [[0, -1500], [300000, 1500]])
    end_time = time.time()
    print("Elapsed time:", end_time - start_time)
    print(densities)
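
The per-point Python loops above are easy to follow but slow for 300,000 points. As a rough sketch of the same idea, the cell indices can be computed for all points at once with NumPy. The function name `grid_density` and its return shape are assumptions made for this illustration, not part of the original demo; it also treats the upper bounds as exclusive and reports NaN for points outside the bounds.

```python
import numpy as np

def grid_density(points, grid_size, bounds):
    """Hypothetical vectorized variant: count points per cell, divide by cell area."""
    (x_min, y_min), (x_max, y_max) = bounds
    pts = np.asarray(points, dtype=float)

    # Number of cells along each axis
    nx = int(np.ceil((x_max - x_min) / grid_size))
    ny = int(np.ceil((y_max - y_min) / grid_size))

    # Keep only points inside the half-open box [min, max)
    inside = ((pts[:, 0] >= x_min) & (pts[:, 0] < x_max) &
              (pts[:, 1] >= y_min) & (pts[:, 1] < y_max))

    # Cell index of every in-bounds point, computed in one step
    ix = ((pts[inside, 0] - x_min) // grid_size).astype(int)
    iy = ((pts[inside, 1] - y_min) // grid_size).astype(int)

    # np.add.at accumulates correctly even when several points share a cell
    counts = np.zeros((nx, ny), dtype=np.int64)
    np.add.at(counts, (ix, iy), 1)

    densities = counts / (grid_size * grid_size)

    # Density of the cell each point belongs to (NaN for out-of-bounds points)
    point_density = np.full(len(pts), np.nan)
    point_density[inside] = densities[ix, iy]
    return densities, point_density

# Small usage example on five hand-picked points
sample = [[0, 0], [5, 5], [15, 5], [25, 25], [30, 0]]
densities, point_density = grid_density(sample, 10, [[0, 0], [30, 30]])
print(densities)
print(point_density)
```

Unlike the original, this sketch returns the per-point densities as a separate array instead of appending them to each point in place, which keeps the input data unchanged.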

Source: https://www.cnblogs.com/shallow-dreamer/p/17135005.html
