Using Mask R-CNN for Object Detection and Instance Segmentation on a Large-Scale High-Resolution Individual-Tree Species Segmentation Dataset (23,000 Crowns across 14 Species)

Posted: 2025-01-03

Large-scale high-resolution individual-tree species segmentation dataset: 23,000 tree crowns segmented and annotated across 14 tree species, captured with a DJI Phantom 4 RTK drone. The 149 GB dataset comprises orthorectified TIFF imagery, point clouds, and detailed ArcGIS vector annotations of individual trees, each labelled with its species.

This article uses a Mask R-CNN model for object detection and instance segmentation. The sections below walk through the steps with code, including the dataset definition, configuration file, and training script.

Directory structure

First, make sure your project directory is structured as follows:

/tree_segmentation_project
    /datasets
        /train
            /images
                *.tif
            /annotations
                *.json
        /valid
            /images
                *.tif
            /annotations
                *.json
    /scripts
        train.py
        datasets.py
        config.yaml
        requirements.txt
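The layout above can be created in one step; a minimal sketch (the project root name is just the one used in this article and can be changed):

```shell
# Create the expected project layout (paths as shown above)
mkdir -p tree_segmentation_project/datasets/train/images
mkdir -p tree_segmentation_project/datasets/train/annotations
mkdir -p tree_segmentation_project/datasets/valid/images
mkdir -p tree_segmentation_project/datasets/valid/annotations
mkdir -p tree_segmentation_project/scripts
```

Drop your `.tif` tiles into the `images` directories and the matching `.json` annotation files (same base name) into the `annotations` directories.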

config.yaml

The configuration file config.yaml contains the training parameters and data paths.

# config.yaml
train: ../datasets/train/
val: ../datasets/valid/

nc: 14
names: ['tree1', 'tree2', 'tree3', 'tree4', 'tree5', 'tree6', 'tree7', 'tree8', 'tree9', 'tree10', 'tree11', 'tree12', 'tree13', 'tree14']

requirements.txt

List all Python packages that need to be installed.

torch>=1.8
torchvision>=0.9
pycocotools
opencv-python
matplotlib
albumentations
labelme2coco
shapely
geopandas
rasterio
pyyaml

datasets.py

Define a dataset class to load the tree-crown segmentation data and apply data augmentation.

import os
from pathlib import Path
import json
from PIL import Image
import torch
from torch.utils.data import Dataset, DataLoader
import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2
import rasterio
import rasterio.features  # rasterize() lives in a submodule and must be imported explicitly
from shapely.geometry import Polygon

class TreeSegmentationDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = Path(root_dir)
        self.transform = transform
        self.img_files = list((self.root_dir / 'images').glob('*.tif'))
        self.label_files = [Path(str(img_file).replace('images', 'annotations').replace('.tif', '.json')) for img_file in self.img_files]

    def __len__(self):
        return len(self.img_files)

    def __getitem__(self, idx):
        img_path = self.img_files[idx]
        label_path = self.label_files[idx]

        with rasterio.open(img_path) as src:
            image = src.read().transpose(1, 2, 0)  # (bands, H, W) -> (H, W, bands)

        with open(label_path, 'r') as f:
            annotations = json.load(f)

        boxes = []
        masks = []
        labels = []

        for feature in annotations['features']:
            geometry = feature['geometry']
            if geometry['type'] == 'Polygon':
                polygon = Polygon(geometry['coordinates'][0])
                minx, miny, maxx, maxy = polygon.bounds
                box = [minx, miny, maxx, maxy]
                # Note: without a transform argument, rasterize() assumes the
                # polygon coordinates are already in pixel space
                mask = rasterio.features.rasterize([polygon], out_shape=image.shape[:2], fill=0, default_value=1)
                class_id = int(feature['properties']['class_id']) + 1  # convert to 1-based index (0 = background)
                boxes.append(box)
                masks.append(mask)
                labels.append(class_id)

        if self.transform:
            transformed = self.transform(image=image, masks=masks, bboxes=boxes, class_labels=labels)
            image = transformed['image']
            masks = transformed['masks']
            boxes = transformed['bboxes']
            labels = transformed['class_labels']

        target = {
            'boxes': torch.as_tensor(boxes, dtype=torch.float32),
            'labels': torch.as_tensor(labels, dtype=torch.int64),
            # after ToTensorV2 the masks come back as tensors; stack them into (N, H, W)
            'masks': torch.stack([torch.as_tensor(m, dtype=torch.uint8) for m in masks]),
        }

        return image, target

# Data augmentation pipelines
data_transforms = {
    'train': A.Compose([
        A.Resize(width=640, height=640),
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.5),
        A.Rotate(limit=180, p=0.7),
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.3),
        A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ToTensorV2(),
    ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels'])),
    'test': A.Compose([
        A.Resize(width=640, height=640),
        A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ToTensorV2(),
    ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels'])),
}
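The dataset class above assumes each annotation JSON has a GeoJSON-like layout: a `features` list whose entries carry a `Polygon` geometry (in pixel coordinates) and a `class_id` property. A hypothetical minimal example, with the box/label derivation from `__getitem__` reproduced in pure Python (your exported ArcGIS annotations may differ in detail):

```python
# Hypothetical minimal annotation matching what __getitem__ expects
annotation = {
    "features": [
        {
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[10, 20], [60, 20], [60, 80], [10, 80], [10, 20]]],
            },
            "properties": {"class_id": 3},
        }
    ]
}

def feature_to_box_and_label(feature):
    """Reproduce the box/label derivation used in TreeSegmentationDataset."""
    ring = feature["geometry"]["coordinates"][0]       # exterior ring of the polygon
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    box = [min(xs), min(ys), max(xs), max(ys)]         # pascal_voc: x_min, y_min, x_max, y_max
    label = int(feature["properties"]["class_id"]) + 1  # shift to 1-based (0 = background)
    return box, label

box, label = feature_to_box_and_label(annotation["features"][0])
print(box, label)  # [10, 20, 60, 80] 4
```

If your labels come from labelme instead, the `labelme2coco` package listed in requirements.txt can convert them, but the key names above would then differ.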

train.py

The training script below trains the Mask R-CNN model.

import torch
import torch.optim as optim
from torchvision.models.detection import maskrcnn_resnet50_fpn_v2
from datasets import TreeSegmentationDataset, data_transforms
from torch.utils.data import DataLoader
import yaml
import time
import datetime
from collections import defaultdict
from collections import deque
import torch.distributed as dist

with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

def collate_fn(batch):
    # torchvision detection models expect a list of image tensors
    # (possibly of different sizes), so keep the batch as lists rather than stacking
    images = [item[0] for item in batch]
    targets = [item[1] for item in batch]
    return images, targets

def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq):
    model.train()
    metric_logger = MetricLogger(delimiter="  ")
    header = f"Epoch: [{epoch}]"

    for images, targets in metric_logger.log_every(data_loader, print_freq, header):
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

        metric_logger.update(loss=losses.item(), **loss_dict)

class MetricLogger(object):
    def __init__(self, delimiter="\t"):
        self.meters = defaultdict(SmoothedValue)
        self.delimiter = delimiter

    def update(self, **kwargs):
        for k, v in kwargs.items():
            if isinstance(v, torch.Tensor):
                v = v.item()
            assert isinstance(v, (float, int))
            self.meters[k].update(v)

    def __getattr__(self, attr):
        if attr in self.meters:
            return self.meters[attr]
        if attr in self.__dict__:
            return self.__dict__[attr]
        raise AttributeError(f"'MetricLogger' object has no attribute '{attr}'")

    def __str__(self):
        # log_every formats '{meters}' with str(self), so a readable __str__ is required
        return self.delimiter.join(f"{name}: {meter}" for name, meter in self.meters.items())

    def log_every(self, iterable, print_freq, header=None):
        i = 0
        if not header:
            header = ""
        start_time = time.time()
        end = time.time()
        iter_time = SmoothedValue(fmt='{avg:.4f}')
        data_time = SmoothedValue(fmt='{avg:.4f}')
        space_fmt = ':' + str(len(str(len(iterable)))) + 'd'
        log_msg = [
            header,
            '[{0' + space_fmt + '}/{1}]',
            'eta: {eta}',
            '{meters}',
            'time: {time}'
        ]
        if torch.cuda.is_available():
            log_msg.append('max mem: {memory:.0f}')
        log_msg = self.delimiter.join(log_msg)
        MB = 1024.0 * 1024.0
        for obj in iterable:
            data_time.update(time.time() - end)
            yield obj
            iter_time.update(time.time() - end)
            if i % print_freq == 0 or i == len(iterable) - 1:
                eta_seconds = iter_time.global_avg * (len(iterable) - i)
                eta_string = str(datetime.timedelta(seconds=int(eta_seconds)))
                if torch.cuda.is_available():
                    print(log_msg.format(
                        i, len(iterable), eta=eta_string, meters=str(self),
                        time=str(iter_time), memory=torch.cuda.max_memory_allocated() / MB))
                else:
                    print(log_msg.format(
                        i, len(iterable), eta=eta_string, meters=str(self),
                        time=str(iter_time)))
            i += 1
            end = time.time()
        total_time = time.time() - start_time
        total_time_str = str(datetime.timedelta(seconds=int(total_time)))
        print('{} Total time: {} ({:.4f} s / it)'.format(
            header, total_time_str, total_time / len(iterable)))

class SmoothedValue(object):
    """Track a series of values and provide access to smoothed values over a
    window or the global series average.
    """

    def __init__(self, window_size=20, fmt=None):
        if fmt is None:
            fmt = "{median:.4f} ({global_avg:.4f})"
        self.deque = deque(maxlen=window_size)
        self.total = 0.0
        self.count = 0
        self.fmt = fmt

    def update(self, value, n=1):
        self.deque.append(value)
        self.count += n
        self.total += value * n

    def synchronize_between_processes(self):
        """
        Warning: does not synchronize the deque!
        """
        if not is_dist_avail_and_initialized():
            return
        t = torch.tensor([self.count, self.total], dtype=torch.float64, device='cuda')
        dist.barrier()
        dist.all_reduce(t)
        t = t.tolist()
        self.count = int(t[0])
        self.total = t[1]

    @property
    def median(self):
        d = torch.tensor(list(self.deque))
        return d.median().item()

    @property
    def avg(self):
        d = torch.tensor(list(self.deque), dtype=torch.float32)
        return d.mean().item()

    @property
    def global_avg(self):
        return self.total / self.count

    @property
    def max(self):
        return max(self.deque)

    @property
    def value(self):
        return self.deque[-1]

    def __str__(self):
        return self.fmt.format(
            median=self.median,
            avg=self.avg,
            global_avg=self.global_avg,
            max=self.max,
            value=self.value)

def is_dist_avail_and_initialized():
    if not dist.is_available():
        return False
    if not dist.is_initialized():
        return False
    return True

def main():
    device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

    dataset_train = TreeSegmentationDataset(root_dir=config['train'], transform=data_transforms['train'])
    dataset_val = TreeSegmentationDataset(root_dir=config['val'], transform=data_transforms['test'])

    data_loader_train = DataLoader(dataset_train, batch_size=2, shuffle=True, num_workers=4, collate_fn=collate_fn)
    data_loader_val = DataLoader(dataset_val, batch_size=2, shuffle=False, num_workers=4, collate_fn=collate_fn)

    # Replace both prediction heads so the outputs match our class count.
    # FastRCNNPredictor / MaskRCNNPredictor are torchvision's standard heads;
    # a bare nn.Linear would not work here because the box head must produce
    # both class scores and box regressions.
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    model = maskrcnn_resnet50_fpn_v2(weights='DEFAULT')
    num_classes = config['nc'] + 1  # background + number of classes
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    hidden_layer = 256
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, hidden_layer, num_classes)
    model.to(device)

    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

    for epoch in range(10):  # number of epochs
        train_one_epoch(model, optimizer, data_loader_train, device=device, epoch=epoch, print_freq=10)

        # save every epoch
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
        }, f'model_epoch_{epoch}.pth')

if __name__ == "__main__":
    main()

总结

The code above covers everything from data preparation to model training. You can adjust the parameters in the configuration file as needed and run the training script to start training the Mask R-CNN model. Make sure your dataset directory structure matches the layout above and that all file paths are correct.


All code in this article is provided for reference only!


From: https://blog.csdn.net/2401_86822270/article/details/144838917
