
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning (CoLA paper notes and code implementation)


Paper

Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning

Code Implementation

Anomaly Injection

inject_anomaly.py

inject_anomaly.py implements the anomaly-injection step: it loads the raw dataset, applies structural and attribute perturbations, and thereby injects both structural and attribute anomalies.
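For reference, the script is driven by the argparse flags defined in the listing below; with the defaults there, a typical run on Cora would look like the following (--n may be omitted, in which case a per-dataset default such as 5 for cora/citeseer is used):

python inject_anomaly.py --dataset cora --seed 1 --m 15 --k 50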

import numpy as np
import scipy.sparse as sp
import random
import scipy.io as sio
import argparse
import pickle as pkl
import networkx as nx
import sys
import os
import os.path as osp
from sklearn import preprocessing
from scipy.spatial.distance import euclidean

def dense_to_sparse(dense_matrix):
    shape = dense_matrix.shape
    row = []
    col = []
    data = []
    for i, r in enumerate(dense_matrix):
        for j in np.where(r > 0)[0]:
            row.append(i)
            col.append(j)
            data.append(dense_matrix[i,j])

    sparse_matrix = sp.coo_matrix((data, (row, col)), shape=shape).tocsc()
    return sparse_matrix
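# Note: for the non-negative matrices used here (adjacency / attributes), the loops above
# are equivalent to sp.csc_matrix(dense_matrix); the explicit version simply makes the
# COO construction visible.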

def parse_index_file(filename):
    """Parse index file."""
    index = []
    for line in open(filename):
        index.append(int(line.strip()))
    return index

def load_citation_datadet(dataset_str):
    """Load a Planetoid-format citation dataset (cora/citeseer/pubmed) from raw_dataset/
    and return dense attributes, a dense adjacency matrix and categorical labels."""
    names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
    objects = []
    for i in range(len(names)):
        with open("raw_dataset/{}/ind.{}.{}".format(dataset_str, dataset_str, names[i]), 'rb') as f:
            if sys.version_info > (3, 0):
                objects.append(pkl.load(f, encoding='latin1'))
            else:
                objects.append(pkl.load(f))

    x, y, tx, ty, allx, ally, graph = tuple(objects)
    test_idx_reorder = parse_index_file("raw_dataset/{}/ind.{}.test.index".format(dataset_str, dataset_str))
    test_idx_range = np.sort(test_idx_reorder)

    if dataset_str == 'citeseer':
        # Fix citeseer dataset (there are some isolated nodes in the graph)
        # Find isolated nodes, add them as zero-vecs into the right position
        test_idx_range_full = range(min(test_idx_reorder), max(test_idx_reorder)+1)
        tx_extended = sp.lil_matrix((len(test_idx_range_full), x.shape[1]))
        tx_extended[test_idx_range-min(test_idx_range), :] = tx
        tx = tx_extended
        ty_extended = np.zeros((len(test_idx_range_full), y.shape[1]))
        ty_extended[test_idx_range-min(test_idx_range), :] = ty
        ty = ty_extended

    features = sp.vstack((allx, tx)).tolil()
    features[test_idx_reorder, :] = features[test_idx_range, :]
    adj = nx.adjacency_matrix(nx.from_dict_of_lists(graph))

    labels = np.vstack((ally, ty))
    labels[test_idx_reorder, :] = labels[test_idx_range, :]

    adj_dense = np.array(adj.todense(), dtype=np.float64)
    attribute_dense = np.array(features.todense(), dtype=np.float64)
    cat_labels = np.array(np.argmax(labels, axis = 1).reshape(-1,1), dtype=np.uint8)

    return attribute_dense, adj_dense, cat_labels



parser = argparse.ArgumentParser()
parser.add_argument('--dataset', type=str, default='cora')  #'BlogCatalog'  'Flickr' 'cora'  'citeseer'  'pubmed' 
parser.add_argument('--seed', type=int, default=1)  #random seed
parser.add_argument('--m', type=int, default=15)  #num of fully connected nodes
parser.add_argument('--n', type=int)  
parser.add_argument('--k', type=int, default=50)  #num of clusters
args = parser.parse_args()


AD_dataset_list = ['BlogCatalog', 'Flickr']
Citation_dataset_list = ['cora', 'citeseer', 'pubmed']

# Set hyperparameters of disturbing
dataset_str = args.dataset  #'BlogCatalog'  'Flickr' 'cora'  'citeseer'  'pubmed'
seed = args.seed
m = args.m  #num of fully connected nodes  #10 15 20   5
k = args.k

if args.n is None:
    if dataset_str == 'cora' or dataset_str == 'citeseer':
        n = 5
    elif dataset_str == 'BlogCatalog':
        n = 10
    elif dataset_str == 'Flickr':
        n = 15
    elif dataset_str == 'pubmed':
        n = 20
else:
    n = args.n
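# Per the injection scheme described in the paper: m is the clique size and n the number of
# cliques, so m*n structural anomalies are created, plus an equal number of attribute
# anomalies (2*m*n in total, e.g. 2*15*5 = 150 for cora with the defaults). For each
# attribute anomaly, k candidate nodes are sampled and the most distant one is used.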

if __name__ == "__main__":

    # Set seed
    print('Random seed: {:d}. \n'.format(seed))
    np.random.seed(seed)
    random.seed(seed)

    # Load data
    print('Loading data: {}...'.format(dataset_str))
    if dataset_str in AD_dataset_list:
        data = sio.loadmat('./raw_dataset/{}/{}.mat'.format(dataset_str, dataset_str))
        attribute_dense = np.array(data['Attributes'].todense())
        attribute_dense = preprocessing.normalize(attribute_dense, axis=0)  # the listing breaks off here in the original post; a column-wise normalize is the most plausible completion
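The listing above is cut off before the actual perturbation and saving steps. The continuation below is not the author's code but a minimal sketch of those steps, following the injection scheme described in the paper (structural anomalies by turning n random groups of m nodes into cliques, attribute anomalies by swapping a node's attributes with the most Euclidean-distant of k sampled candidates); the output path and the keys passed to sio.savemat are illustrative assumptions.

    elif dataset_str in Citation_dataset_list:
        attribute_dense, adj_dense, cat_labels = load_citation_datadet(dataset_str)

    num_node = adj_dense.shape[0]
    label_anomaly = np.zeros((num_node, 1), dtype=np.uint8)

    # Pick 2*m*n distinct nodes: half become structural anomalies, half attribute anomalies
    anomaly_idx = random.sample(range(num_node), 2 * m * n)
    structure_idx = anomaly_idx[:m * n]
    attribute_idx = anomaly_idx[m * n:]
    label_anomaly[anomaly_idx] = 1

    # Structural anomalies: make each group of m nodes fully connected (a clique)
    for q in range(n):
        clique = structure_idx[q * m:(q + 1) * m]
        for i in clique:
            for j in clique:
                if i != j:
                    adj_dense[i, j] = 1.0

    # Attribute anomalies: replace the node's attributes with those of the most
    # Euclidean-distant node among k randomly sampled candidates
    for i in attribute_idx:
        candidates = random.sample(range(num_node), k)
        dist = [euclidean(attribute_dense[i], attribute_dense[c]) for c in candidates]
        attribute_dense[i] = attribute_dense[candidates[int(np.argmax(dist))]]

    # Save the disturbed graph (path and key names are illustrative)
    sio.savemat('./dataset/{}.mat'.format(dataset_str),
                {'Network': dense_to_sparse(adj_dense),
                 'Attributes': dense_to_sparse(attribute_dense),
                 'Label': cat_labels,
                 'gnd': label_anomaly})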

From: https://blog.csdn.net/Misnearch/article/details/139624723
