Welcome to ROKSANA’s documentation!


Contents:

API Documentation

class roksana.Evaluator(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Bases: object

Evaluator class to assess the impact of attack methods on search strategies.

__init__(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Initialize the Evaluator.

Parameters:

search_method_before – Instance of SearchMethod before attack.
search_method_after – Instance of SearchMethod after attack.
k_values (List[int], optional) – List of k values for Hit@k and Recall@k. Defaults to [5, 10, 20].

evaluate(queries: List[int], gold_sets: List[List[int]], results_dir: str = 'results', filename: str = 'evaluation_results.csv') → None[source]

Perform evaluation on the given queries and save the results.

Parameters:

queries (List[int]) – List of query node indices.
gold_sets (List[List[int]]) – List of gold sets corresponding to each query.
results_dir (str, optional) – Directory to save the results file. Defaults to ‘results’.
filename (str, optional) – Name of the results file. Defaults to ‘evaluation_results.csv’.

get_all_results() → List[Dict[str, Any]][source]

Retrieve all evaluation results.

Returns: List of evaluation result dictionaries.
Return type: List[Dict[str, Any]]
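
The evaluation flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name evaluate_sketch, the CSV column names, and the use of callables that map a query index to a ranked list are all hypothetical, and recall is computed with the common definition (gold nodes found in the top k divided by the gold set size).

```python
import csv
import os
import tempfile
from typing import Any, Callable, Dict, List, Sequence


def evaluate_sketch(
    search_before: Callable[[int], List[int]],
    search_after: Callable[[int], List[int]],
    queries: List[int],
    gold_sets: List[List[int]],
    k_values: Sequence[int] = (5, 10, 20),
    results_dir: str = "results",
    filename: str = "evaluation_results.csv",
) -> List[Dict[str, Any]]:
    """Score both searchers per query and write one CSV row per query."""
    rows: List[Dict[str, Any]] = []
    for query, gold in zip(queries, gold_sets):
        gold_lookup = set(gold)
        row: Dict[str, Any] = {"query": query}
        for label, search in (("before", search_before), ("after", search_after)):
            retrieved = search(query)
            for k in k_values:
                hits = len(gold_lookup.intersection(retrieved[:k]))
                # Hypothetical column names; the real file layout may differ.
                row[f"hit@{k}_{label}"] = 1.0 if hits else 0.0
                row[f"recall@{k}_{label}"] = hits / len(gold_lookup) if gold_lookup else 0.0
        rows.append(row)

    os.makedirs(results_dir, exist_ok=True)
    path = os.path.join(results_dir, filename)
    fieldnames = list(rows[0]) if rows else ["query"]
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Toy rankings standing in for search results before and after an attack.
ranking_before = {0: [1, 2, 3], 1: [0, 2, 3]}
ranking_after = {0: [3, 2, 1], 1: [3, 2, 0]}

with tempfile.TemporaryDirectory() as tmp_dir:
    rows = evaluate_sketch(
        ranking_before.__getitem__,
        ranking_after.__getitem__,
        queries=[0, 1],
        gold_sets=[[1], [0]],
        k_values=(1, 3),
        results_dir=tmp_dir,
    )
    csv_written = os.path.exists(os.path.join(tmp_dir, "evaluation_results.csv"))

print(rows[0]["hit@1_before"], rows[0]["hit@1_after"])  # 1.0 0.0
```

In the toy data the gold node for query 0 drops from first to third place, so Hit@1 falls from 1.0 to 0.0 while Hit@3 is unchanged.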

class roksana.GATSearch(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)[source]

Bases: SearchMethod

Search method using Graph Attention Networks (GAT).

__init__(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)[source]

Initialize and train the GAT model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in GAT layers.
heads (int, optional) – Number of attention heads in GAT layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int][source]

Perform a search with the given query features using GAT embeddings.

Parameters:

query_features (torch.Tensor) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()[source]: Train the GAT model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.GCNSearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Bases: SearchMethod

Search method using Graph Convolutional Networks (GCN).

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Initialize and train the GCN model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in GCN layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[List[int]][source]

Perform a search with the given query features using GCN embeddings.

Parameters:

query_features (torch.Tensor) – Feature tensor of the query nodes, shape [num_queries, feature_dim] or [feature_dim].
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of lists containing node indices sorted by similarity to each query.

Return type:

List[List[int]]
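
Unlike the GAT and GraphSAGE variants, this search accepts either one feature vector or a batch and always returns one result list per query. The sketch below shows one way to normalize the two input shapes, using plain lists and a dot-product score. The helper names and the scoring function are assumptions for illustration only; the library operates on torch tensors and its similarity measure is not documented here.

```python
from typing import List, Sequence


def as_batch(query_features: Sequence) -> List[Sequence[float]]:
    """Wrap a single feature vector so downstream code always sees a batch."""
    if len(query_features) > 0 and isinstance(query_features[0], (int, float)):
        return [query_features]
    return list(query_features)


def batched_search_sketch(
    embeddings: Sequence[Sequence[float]],
    query_features: Sequence,
    top_k: int = 10,
) -> List[List[int]]:
    """Return, for each query, the indices of the top_k highest-scoring nodes."""
    results = []
    for query in as_batch(query_features):
        # Dot product as a stand-in similarity score (assumption).
        scores = [sum(q * e for q, e in zip(query, row)) for row in embeddings]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        results.append(ranked[:top_k])
    return results


embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(batched_search_sketch(embeddings, [1.0, 0.0], top_k=2))                # [[0, 2]]
print(batched_search_sketch(embeddings, [[1.0, 0.0], [0.0, 1.0]], top_k=2))  # [[0, 2], [1, 2]]
```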

train_model()[source]: Train the GCN model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.SAGESearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Bases: SearchMethod

Search method using GraphSAGE.

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Initialize and train the GraphSAGE model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in SAGE layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int][source]

Perform a search with the given query features using GraphSAGE embeddings.

Parameters:

query_features (torch.Tensor) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()[source]: Train the GraphSAGE model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.SearchMethod(data: Any, device: str = None, **kwargs)[source]

Bases: ABC

Abstract base class for search methods.

abstract __init__(data: Any, device: str = None, **kwargs)[source]

Initialize the search method with the given dataset.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

abstract search(query_features: Any, top_k: int = 10) → List[int][source]

Perform a search with the given query features.

Parameters:

query_features (Any) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]
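
A concrete search method only needs to implement the two abstract methods. The following sketch uses a stand-in abstract class rather than importing roksana, and ranks nodes by cosine similarity over precomputed embeddings. The class names SearchMethodSketch and CosineSearchSketch and the choice of cosine similarity are assumptions made for illustration.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence


class SearchMethodSketch(ABC):
    """Stand-in for the abstract interface described above (hypothetical)."""

    @abstractmethod
    def __init__(self, data, device: str = None, **kwargs):
        ...

    @abstractmethod
    def search(self, query_features, top_k: int = 10) -> List[int]:
        ...


class CosineSearchSketch(SearchMethodSketch):
    """Ranks nodes by cosine similarity between the query and node embeddings."""

    def __init__(self, data: Sequence[Sequence[float]], device: str = None, **kwargs):
        self.embeddings = [list(row) for row in data]

    @staticmethod
    def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query_features: Sequence[float], top_k: int = 10) -> List[int]:
        scores = [self._cosine(query_features, row) for row in self.embeddings]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return ranked[:top_k]


searcher = CosineSearchSketch([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
print(searcher.search([1.0, 0.0], top_k=2))  # [0, 2]
```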

class roksana.UserDataset(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)[source]

Bases: InMemoryDataset

A dataset class for user-provided datasets adhering to PyG’s InMemoryDataset structure.

Users should provide their data in a specific format, typically as a list of torch_geometric.data.Data objects.

__init__(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)[source]

Initialize the UserDataset.

Parameters:

root (str) – Root directory where the dataset should be saved.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.
data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. If provided, it will be used to initialize the dataset.

download()[source]: Users are expected to provide their own data, so no download is necessary.

process()[source]

Process the user-provided data and save it in the processed file.

Users can modify this method if they have specific processing requirements.

property processed_file_names: List[str]: The name of the processed file.

property raw_file_names: List[str]: Since users provide their own data, this can be left empty or used to list expected raw files.

roksana.demotion_value(before_attack_rank: int, after_attack_rank: int) → int[source]

Calculate the Demotion Value metric.

Parameters:

before_attack_rank (int) – The rank of the target node before the attack.
after_attack_rank (int) – The rank of the target node after the attack.

Returns:

Difference in rank (after_attack_rank - before_attack_rank). A positive value indicates demotion.

Return type:

int
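
A minimal sketch of the metric as documented, with a hypothetical function name so it is not mistaken for the library's source:

```python
def demotion_value_sketch(before_attack_rank: int, after_attack_rank: int) -> int:
    """Rank difference; positive means the target moved down the ranking."""
    return after_attack_rank - before_attack_rank


# A target that falls from rank 3 to rank 10 is demoted by 7 positions.
print(demotion_value_sketch(3, 10))  # 7
# A target that rises from rank 8 to rank 5 yields a negative value.
print(demotion_value_sketch(8, 5))   # -3
```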

roksana.get_attack_method(name: str, data: Any, **kwargs) → BaseAttack[source]

Retrieve an instance of the specified attack method.

Parameters:

name (str) – Name of the attack method (e.g., ‘degree’, ‘pagerank’, ‘random’, ‘viking’).
data (Any) – The graph dataset.
**kwargs – Additional keyword arguments for initializing the attack method.

Returns:

An instance of the requested attack method.

Return type:

BaseAttack

Raises:

ValueError – If the specified attack method is not registered.

Example

>>> from roksana.attack_methods.registry import get_attack_method
>>> attack = get_attack_method('degree', data=my_graph, param1=value1)
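
The lookup-by-name behavior, including the ValueError for unregistered names, follows a common registry pattern. The sketch below illustrates that pattern under assumptions: the dictionary ATTACK_REGISTRY_SKETCH, the register_attack decorator, and the NoOpAttack class are hypothetical and are not part of ROKSANA.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a name to an attack class.
ATTACK_REGISTRY_SKETCH: Dict[str, Callable[..., Any]] = {}


def register_attack(name: str):
    """Class decorator that stores the class in the registry under `name`."""
    def decorator(cls):
        ATTACK_REGISTRY_SKETCH[name] = cls
        return cls
    return decorator


@register_attack("noop")
class NoOpAttack:
    """Placeholder attack that only stores its inputs."""

    def __init__(self, data: Any, **kwargs):
        self.data = data
        self.params = kwargs


def get_attack_method_sketch(name: str, data: Any, **kwargs):
    """Instantiate the registered class, or raise ValueError if unknown."""
    try:
        attack_cls = ATTACK_REGISTRY_SKETCH[name]
    except KeyError:
        available = ", ".join(sorted(ATTACK_REGISTRY_SKETCH))
        raise ValueError(f"Unknown attack method '{name}'. Available: {available}") from None
    return attack_cls(data, **kwargs)


attack = get_attack_method_sketch("noop", data=[(0, 1)], budget=2)
print(type(attack).__name__, attack.params)  # NoOpAttack {'budget': 2}
```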

roksana.get_dataset_info(dataset: InMemoryDataset) → Dict[str, Any][source]

Retrieve basic information about a dataset.

Parameters: dataset (InMemoryDataset) – The dataset instance.
Returns: A dictionary containing dataset information.
Return type: Dict[str, Any]

roksana.get_search_method(name: str, data: Any, device: str = None, **kwargs) → SearchMethod[source]

Retrieve an instance of the specified search method.

Parameters:

name (str) – Name of the search method (e.g., ‘gcn’, ‘gat’, ‘sage’).
data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
**kwargs – Additional keyword arguments for the search method.

Returns:

An instance of the requested search method.

Return type:

SearchMethod

Raises:

ValueError – If the specified search method is not registered.

roksana.hit_at_k(retrieved: List[int], gold_set: List[int], k: int) → float[source]

Calculate Hit@k metric.

Parameters:

retrieved (List[int]) – List of retrieved node indices.
gold_set (List[int]) – List of gold node indices.
k (int) – The k in Hit@k.

Returns:

Hit@k value (1 if at least one gold node is in the top-k, else 0).

Return type:

float
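
A minimal sketch of the metric as described, with a hypothetical function name:

```python
from typing import List


def hit_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Return 1.0 if any gold node appears in the first k retrieved nodes, else 0.0."""
    gold = set(gold_set)
    return 1.0 if any(node in gold for node in retrieved[:k]) else 0.0


retrieved = [7, 2, 9, 4, 1]
print(hit_at_k_sketch(retrieved, gold_set=[4, 8], k=3))  # 0.0
print(hit_at_k_sketch(retrieved, gold_set=[4, 8], k=5))  # 1.0
```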

roksana.list_available_standard_datasets() → List[str][source]

List all available standard datasets supported by ROKSANA.

Returns: A list of supported dataset names.
Return type: List[str]

roksana.load_dataset(dataset_name: str | None = None, root: str = 'data', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None) → InMemoryDataset[source]

Load a dataset, either a standard dataset or a user-provided dataset.

Parameters:

dataset_name (str, optional) – Name of the standard dataset to load (e.g., ‘cora’, ‘citeseer’). If None, the data for a UserDataset must be supplied via data_list.
root (str, optional) – Root directory where the dataset should be saved or loaded from. Defaults to ‘data’.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.
data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. Required if dataset_name is None.

Returns:

An instance of the loaded dataset.

Return type:

InMemoryDataset

roksana.load_standard_dataset(name: str, root: str = 'data') → Planetoid[source]

Load a standard dataset from PyG’s built-in datasets.

Supported datasets: ‘cora’, ‘citeseer’, ‘pubmed’, etc. Refer to PyG’s Planetoid datasets for more.

Parameters:

name (str) – Name of the dataset to load (e.g., ‘Cora’, ‘Citeseer’).
root (str, optional) – Root directory where the dataset should be saved. Defaults to ‘data’.

Returns:

An instance of the Planetoid dataset.

Return type:

Planetoid

roksana.load_user_dataset_from_files(data_dir: str, file_format: str = 'json', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None) → UserDataset[source]

Load a user dataset from files in a specified directory.

Supported file formats: ‘json’, ‘csv’, ‘pickle’.

Parameters:

data_dir (str) – Directory containing the dataset files.
file_format (str, optional) – Format of the dataset files. Defaults to ‘json’.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

Returns:

An instance of the UserDataset loaded from the files.

Return type:

UserDataset

roksana.prepare_search_set(data: Data, percentage: float = 0.1, seed: int = 42) → Tuple[List[int], List[List[int]]][source]

Prepare a search set for search evaluation by selecting a percentage of nodes as queries and creating corresponding gold sets based on feature similarity.

Parameters:

data (Data) – The graph dataset.
percentage (float, optional) – Fraction of nodes to select as queries, expressed as a value between 0 and 1. Defaults to 0.1 (10%).
seed (int, optional) – Seed for random number generator to ensure reproducibility. Defaults to 42.

Returns:

A tuple containing:

queries (List[int]): List of node indices selected as queries.
gold_sets (List[List[int]]): List of gold sets, where each gold set is a list of node indices with the same features as the corresponding query.

Return type:

Tuple[List[int], List[List[int]]]

Raises:

ValueError – If percentage is not between 0 and 1.
AttributeError – If dataset does not contain node features (data.x).

roksana.recall_at_k(retrieved: List[int], gold_set: List[int], k: int) → float[source]

Calculate Recall@k metric.

Parameters:

retrieved (List[int]) – List of retrieved node indices.
gold_set (List[int]) – List of gold node indices.
k (int) – The k in Recall@k.

Returns:

Recall@k value.

Return type:

float
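
The exact formula is not spelled out above. The sketch below assumes the common definition, the number of gold nodes found in the top k divided by the size of the gold set, and returns 0.0 for an empty gold set. Both choices and the function name are assumptions.

```python
from typing import List


def recall_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Fraction of gold nodes that appear among the first k retrieved nodes."""
    gold = set(gold_set)
    if not gold:
        return 0.0  # assumption: define recall as 0 when there is nothing to find
    hits = len(gold.intersection(retrieved[:k]))
    return hits / len(gold)


retrieved = [7, 2, 9, 4, 1]
print(recall_at_k_sketch(retrieved, gold_set=[2, 4], k=3))  # 0.5
print(recall_at_k_sketch(retrieved, gold_set=[2, 4], k=5))  # 1.0
```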

class roksana.datasets.UserDataset(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)[source]

Bases: InMemoryDataset

A dataset class for user-provided datasets adhering to PyG’s InMemoryDataset structure.

Users should provide their data in a specific format, typically as a list of torch_geometric.data.Data objects.

__init__(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)[source]

Initialize the UserDataset.

Parameters:

root (str) – Root directory where the dataset should be saved.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.
data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. If provided, it will be used to initialize the dataset.

download()[source]: Users are expected to provide their own data, so no download is necessary.

process()[source]

Process the user-provided data and save it in the processed file.

Users can modify this method if they have specific processing requirements.

property processed_file_names: List[str]: The name of the processed file.

property raw_file_names: List[str]: Since users provide their own data, this can be left empty or used to list expected raw files.

roksana.datasets.get_dataset_info(dataset: InMemoryDataset) → Dict[str, Any][source]

Retrieve basic information about a dataset.

Parameters: dataset (InMemoryDataset) – The dataset instance.
Returns: A dictionary containing dataset information.
Return type: Dict[str, Any]

roksana.datasets.list_available_standard_datasets() → List[str][source]

List all available standard datasets supported by ROKSANA.

Returns: A list of supported dataset names.
Return type: List[str]

roksana.datasets.load_dataset(dataset_name: str | None = None, root: str = 'data', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None) → InMemoryDataset[source]

Load a dataset, either a standard dataset or a user-provided dataset.

Parameters:

dataset_name (str, optional) – Name of the standard dataset to load (e.g., ‘cora’, ‘citeseer’). If None, the data for a UserDataset must be supplied via data_list.
root (str, optional) – Root directory where the dataset should be saved or loaded from. Defaults to ‘data’.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.
data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. Required if dataset_name is None.

Returns:

An instance of the loaded dataset.

Return type:

InMemoryDataset

roksana.datasets.load_standard_dataset(name: str, root: str = 'data') → Planetoid[source]

Load a standard dataset from PyG’s built-in datasets.

Supported datasets: ‘cora’, ‘citeseer’, ‘pubmed’, etc. Refer to PyG’s Planetoid datasets for more.

Parameters:

name (str) – Name of the dataset to load (e.g., ‘Cora’, ‘Citeseer’).
root (str, optional) – Root directory where the dataset should be saved. Defaults to ‘data’.

Returns:

An instance of the Planetoid dataset.

Return type:

Planetoid

roksana.datasets.load_user_dataset_from_files(data_dir: str, file_format: str = 'json', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None) → UserDataset[source]

Load a user dataset from files in a specified directory.

Supported file formats: ‘json’, ‘csv’, ‘pickle’.

Parameters:

data_dir (str) – Directory containing the dataset files.
file_format (str, optional) – Format of the dataset files. Defaults to ‘json’.
transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

Returns:

An instance of the UserDataset loaded from the files.

Return type:

UserDataset

roksana.datasets.prepare_search_set(data: Data, percentage: float = 0.1, seed: int = 42) → Tuple[List[int], List[List[int]]][source]

Prepare a search set for search evaluation by selecting a percentage of nodes as queries and creating corresponding gold sets based on feature similarity.

Parameters:

data (Data) – The graph dataset.
percentage (float, optional) – Fraction of nodes to select as queries, expressed as a value between 0 and 1. Defaults to 0.1 (10%).
seed (int, optional) – Seed for random number generator to ensure reproducibility. Defaults to 42.

Returns:

A tuple containing:

queries (List[int]): List of node indices selected as queries.
gold_sets (List[List[int]]): List of gold sets, where each gold set is a list of node indices with the same features as the corresponding query.

Return type:

Tuple[List[int], List[List[int]]]

Raises:

ValueError – If percentage is not between 0 and 1.
AttributeError – If dataset does not contain node features (data.x).

class roksana.search_methods.GATSearch(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)[source]

Bases: SearchMethod

Search method using Graph Attention Networks (GAT).

__init__(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)[source]

Initialize and train the GAT model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in GAT layers.
heads (int, optional) – Number of attention heads in GAT layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int][source]

Perform a search with the given query features using GAT embeddings.

Parameters:

query_features (torch.Tensor) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()[source]: Train the GAT model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.GCNSearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Bases: SearchMethod

Search method using Graph Convolutional Networks (GCN).

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Initialize and train the GCN model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in GCN layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[List[int]][source]

Perform a search with the given query features using GCN embeddings.

Parameters:

query_features (torch.Tensor) – Feature tensor of the query nodes, shape [num_queries, feature_dim] or [feature_dim].
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of lists containing node indices sorted by similarity to each query.

Return type:

List[List[int]]

train_model()[source]: Train the GCN model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.SAGESearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Bases: SearchMethod

Search method using GraphSAGE.

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)[source]

Initialize and train the GraphSAGE model.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
hidden_channels (int, optional) – Number of hidden channels in SAGE layers.
epochs (int, optional) – Number of training epochs.
lr (float, optional) – Learning rate for the optimizer.

evaluate() → float[source]

Evaluate the model’s accuracy on the training set.

Returns: Training accuracy.
Return type: float

get_node_embeddings() → Tensor[source]

Generate node embeddings by passing the data through the model.

Returns: Node embeddings.
Return type: torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int][source]

Perform a search with the given query features using GraphSAGE embeddings.

Parameters:

query_features (torch.Tensor) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()[source]: Train the GraphSAGE model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.SearchMethod(data: Any, device: str = None, **kwargs)[source]

Bases: ABC

Abstract base class for search methods.

abstract __init__(data: Any, device: str = None, **kwargs)[source]

Initialize the search method with the given dataset.

Parameters:

data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

abstract search(query_features: Any, top_k: int = 10) → List[int][source]

Perform a search with the given query features.

Parameters:

query_features (Any) – Feature vector of the query node.
top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

roksana.search_methods.get_search_method(name: str, data: Any, device: str = None, **kwargs) → SearchMethod[source]

Retrieve an instance of the specified search method.

Parameters:

name (str) – Name of the search method (e.g., ‘gcn’, ‘gat’, ‘sage’).
data (Any) – The graph dataset.
device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).
**kwargs – Additional keyword arguments for the search method.

Returns:

An instance of the requested search method.

Return type:

SearchMethod

Raises:

ValueError – If the specified search method is not registered.

Attack Methods Package

This package provides various attack methods for adversarial modifications of graph datasets. It includes both predefined attack methods and utilities for registering and retrieving custom attacks.

Modules:

base_attack: Defines the abstract base class for all attack methods.
registry: Manages registration and retrieval of attack methods.
degree: Implements degree-based edge removal.
pagerank: Implements PageRank-based edge removal.
random: Implements random edge removal.
viking: Implements Viking perturbation attack.

- BaseAttack: Abstract base class for all attack methods.

- get_attack_method: Retrieve a registered attack method by name.

- ATTACK_METHODS: Dictionary of registered attack methods.

- DegreeAttack: Implements degree-based attack logic.

- PageRankAttack: Implements PageRank-based attack logic.

- RandomAttack: Implements random attack logic.

- VikingAttack: Implements Viking perturbation attack logic.

class roksana.attack_methods.BaseAttack(data: Any, **kwargs)[source]

Bases: ABC

Abstract BaseAttack Class

Defines the interface for attack methods. All attack methods must inherit from this class and implement the __init__ and attack methods.

data

The graph dataset used by the attack method.

Type: Any

params

Additional parameters for the attack method.

Type: dict

abstract __init__(data: Any, **kwargs)[source]

Initialize the attack method with the given dataset.

Parameters:

data (Any) – The graph dataset.
**kwargs – Additional keyword arguments specific to the attack method.

abstract attack(query_node: int, perturbations: int = 1) → Dict[str, Any][source]

Perform an attack on the specified query node.

Parameters:

query_node (int) – Index of the node to attack.
perturbations (int, optional) – Number of perturbations to apply. Defaults to 1.

Returns:

A dictionary containing details of the attack, such as removed edges or modifications made.

Return type:

Dict[str, Any]
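
A subclass that satisfies this interface might look like the sketch below. It uses a stand-in abstract class and a plain list of edge tuples as the dataset. The class names, the edge-list representation, and the keys of the returned dictionary are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple

Edge = Tuple[int, int]


class BaseAttackSketch(ABC):
    """Stand-in for the abstract interface described above (hypothetical)."""

    @abstractmethod
    def __init__(self, data: Any, **kwargs):
        self.data = data
        self.params = kwargs

    @abstractmethod
    def attack(self, query_node: int, perturbations: int = 1) -> Dict[str, Any]:
        ...


class FirstEdgesAttack(BaseAttackSketch):
    """Removes the first `perturbations` edges that touch the query node."""

    def __init__(self, data: List[Edge], **kwargs):
        super().__init__(data, **kwargs)

    def attack(self, query_node: int, perturbations: int = 1) -> Dict[str, Any]:
        incident = [edge for edge in self.data if query_node in edge]
        removed = incident[:perturbations]
        remaining = [edge for edge in self.data if edge not in removed]
        return {
            "query_node": query_node,
            "removed_edges": removed,
            "remaining_edges": remaining,
        }


edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
result = FirstEdgesAttack(edges).attack(query_node=1, perturbations=2)
print(result["removed_edges"])    # [(0, 1), (1, 2)]
print(result["remaining_edges"])  # [(2, 3), (1, 3)]
```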

class roksana.attack_methods.DegreeAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

DegreeAttack Class

Implements an adversarial attack that removes edges based on the degree of connected nodes.

data

The graph dataset.

Type: Any

params

Additional parameters for the attack.

Type: dict

__init__(data: Any, **kwargs)[source]

Initialize the DegreeAttack method.

Parameters:

data (Any) – The graph dataset.
**kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]][source]

Perform the degree-based attack on the graph dataset.

Parameters:

data (Any) – The graph dataset.
selected_nodes (torch.Tensor) – Nodes to target for edge removal. Must be a 1D tensor.

Returns:

updated_data (Any): The modified graph dataset with updated edges.
removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]

class roksana.attack_methods.PageRankAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

PageRankAttack Class

Implements an adversarial attack that removes edges connected to nodes based on their PageRank scores.

data

The graph dataset.

Type: Any

params

Additional parameters for the attack.

Type: dict

__init__(data: Any, **kwargs)[source]

Initialize the PageRankAttack method.

Parameters:

data (Any) – The graph dataset.
**kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]][source]

Perform the PageRank-based attack on the graph dataset.

Parameters:

data (Any) – The graph dataset.
selected_nodes (torch.Tensor) – Nodes to target for edge removal. Must be a 1D tensor.

Returns:

updated_data (Any): The modified graph dataset with updated edges.
removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]

class roksana.attack_methods.RandomAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

RandomAttack Class

Implements an adversarial attack that randomly removes edges connected to specified nodes in a graph.

data

The graph dataset.

Type: Any

params

Additional parameters for the attack.

Type: dict

__init__(data: Any, **kwargs)[source]

Initialize the RandomAttack method.

Parameters:

data (Any) – The graph dataset.
**kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]][source]

Perform the random edge removal attack.

Parameters:

data (Any) – The graph dataset.
selected_nodes (torch.Tensor) – Nodes for which edges are to be removed. Can be a single node or a tensor of nodes.

Returns:

updated_data (Any): The modified graph dataset with updated edges.
removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]
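
The random removal described above can be illustrated without the library. The following is a minimal sketch in plain Python, not ROKSANA's implementation: the function name remove_random_edges, the fraction and seed parameters, and the list-of-tuples edge representation are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def remove_random_edges(
    edges: Sequence[Edge],
    selected_nodes: Sequence[int],
    fraction: float = 0.5,  # hypothetical parameter: share of candidate edges to drop
    seed: int = 0,          # hypothetical parameter: makes the sampling reproducible
) -> Tuple[List[Edge], List[Edge]]:
    """Hypothetical helper: drop a random share of the edges touching selected nodes."""
    rng = random.Random(seed)
    targets = set(selected_nodes)
    # Candidate edges have at least one endpoint in the target set.
    candidates = [e for e in edges if e[0] in targets or e[1] in targets]
    n_remove = int(len(candidates) * fraction)
    removed = set(rng.sample(candidates, n_remove))
    # Keep the original edge order for everything that was not sampled.
    kept = [e for e in edges if e not in removed]
    return kept, sorted(removed)
```

Returning both the kept edges and the removed edges mirrors the (updated data, removed edges) pair documented for attack().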

class roksana.attack_methods.VikingAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

VikingAttack Class

Implements an adversarial attack by perturbing edges involving specified nodes in a graph.

data

The graph dataset.

Type: Any

params

Additional parameters for the attack.

Type: dict

__init__(data: Any, **kwargs)[source]

Initialize the VikingAttack method.

Parameters:

data (Any) – The graph dataset.
**kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]][source]

Execute the Viking perturbation attack.

Parameters:

data (Any) – The graph dataset.
selected_nodes (torch.Tensor) – Nodes to target for edge removal.

Returns:

updated_data (Any): The modified graph dataset with updated edges.
removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]

perturbation_attack(data: Any, selected_nodes: Tensor) → Tuple[Tensor, List[Tuple[int, int]]][source]

Perform the Viking perturbation attack by removing edges involving selected nodes.

Parameters:

data (Any) – The graph dataset.
selected_nodes (torch.Tensor) – Nodes to target for edge removal.

Returns:

retained_edges (torch.Tensor): The edge index after removal of edges.
edges_to_remove (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[torch.Tensor, List[Tuple[int, int]]]
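
The split into retained and removed edges can be sketched as follows. This is an illustration only, written in plain Python rather than torch: the name split_edges is hypothetical, and the edge index is assumed to be two parallel lists (sources, targets), mimicking a 2 x E tensor.

```python
from typing import List, Sequence, Tuple


def split_edges(
    edge_index: Sequence[Sequence[int]],
    selected_nodes: Sequence[int],
) -> Tuple[List[List[int]], List[Tuple[int, int]]]:
    """Hypothetical helper: separate edges that involve any selected node."""
    targets = set(selected_nodes)
    retained: List[List[int]] = [[], []]
    removed: List[Tuple[int, int]] = []
    for src, dst in zip(edge_index[0], edge_index[1]):
        if src in targets or dst in targets:
            # The edge touches a selected node, so it is removed.
            removed.append((src, dst))
        else:
            retained[0].append(src)
            retained[1].append(dst)
    return retained, removed
```

For the edge index [[0, 1, 2, 3], [1, 2, 3, 0]] and selected node 1, the sketch removes (0, 1) and (1, 2) and retains [[2, 3], [3, 0]].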

roksana.attack_methods.get_attack_method(name: str, data: Any, **kwargs) → BaseAttack[source]

Retrieve an instance of the specified attack method.

Parameters:

name (str) – Name of the attack method (e.g., ‘degree’, ‘pagerank’, ‘random’, ‘viking’).
data (Any) – The graph dataset.
**kwargs – Additional keyword arguments for initializing the attack method.

Returns:

An instance of the requested attack method.

Return type:

BaseAttack

Raises:

ValueError – If the specified attack method is not registered.

Example

>>> from roksana.attack_methods.registry import get_attack_method
>>> attack = get_attack_method('degree', data=my_graph, param1=value1)
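
A name-based lookup that raises ValueError for unregistered names is commonly built on a registry dictionary. The sketch below shows that general pattern under stated assumptions; _REGISTRY, register, NoOpAttack, and lookup_attack are hypothetical names for illustration and are not ROKSANA's internals.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping attack names to classes.
_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[type], type]:
    """Hypothetical decorator that records a class under the given name."""
    def decorator(cls: type) -> type:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("noop")
class NoOpAttack:
    """Placeholder attack that only stores its inputs."""

    def __init__(self, data: Any, **kwargs: Any) -> None:
        self.data = data
        self.params = kwargs


def lookup_attack(name: str, data: Any, **kwargs: Any) -> Any:
    """Instantiate the class registered under name, or raise ValueError."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown attack method: {name!r}") from None
    return cls(data, **kwargs)
```

Extra keyword arguments are passed to the constructor and stored, matching the params attribute documented for the attack classes.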

class roksana.evaluation.Evaluator(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Bases: object

Evaluator class to assess the impact of attack methods on search strategies.

__init__(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Initialize the Evaluator.

Parameters:

search_method_before – Instance of SearchMethod before attack.
search_method_after – Instance of SearchMethod after attack.
k_values (List[int], optional) – List of k values for Hit@k and Recall@k. Defaults to [5, 10, 20].

evaluate(queries: List[int], gold_sets: List[List[int]], results_dir: str = 'results', filename: str = 'evaluation_results.csv') → None[source]

Perform evaluation on the given queries and save the results.

Parameters:

queries (List[int]) – List of query node indices.
gold_sets (List[List[int]]) – List of gold sets corresponding to each query.
results_dir (str, optional) – Directory to save the results file. Defaults to ‘results’.
filename (str, optional) – Name of the results file. Defaults to ‘evaluation_results.csv’.

get_all_results() → List[Dict[str, Any]][source]

Retrieve all evaluation results.

Returns:

List of evaluation result dictionaries.

Return type:

List[Dict[str, Any]]

roksana.evaluation.demotion_value(before_attack_rank: int, after_attack_rank: int) → int[source]

Calculate the Demotion Value metric.

Parameters:

before_attack_rank (int) – The rank of the target node before the attack.
after_attack_rank (int) – The rank of the target node after the attack.

Returns:

Difference in rank (after_attack_rank - before_attack_rank). A positive value indicates demotion.

Return type:

int
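
The metric is the subtraction stated above. A minimal stand-alone sketch follows; the name demotion is hypothetical, chosen to avoid implying that this is the library's source.

```python
def demotion(before_attack_rank: int, after_attack_rank: int) -> int:
    """Rank difference; positive means the node moved down the ranking."""
    return after_attack_rank - before_attack_rank
```

A node that moves from rank 3 to rank 10 has a demotion value of 7. A node that moves from rank 10 to rank 3 has a value of -7, meaning it was promoted.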

roksana.evaluation.hit_at_k(retrieved: List[int], gold_set: List[int], k: int) → float[source]

Calculate Hit@k metric.

Parameters:

retrieved (List[int]) – List of retrieved node indices.
gold_set (List[int]) – List of gold node indices.
k (int) – The k in Hit@k.

Returns:

Hit@k value (1 if at least one gold node is in the top-k, else 0).

Return type:

float
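
The rule in the Returns description can be sketched as below. The name hit_at_k_sketch is hypothetical, and the function is an illustration of the stated rule rather than the library's code.

```python
from typing import List


def hit_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Return 1.0 if any gold node appears among the first k retrieved nodes."""
    gold = set(gold_set)
    return 1.0 if any(node in gold for node in retrieved[:k]) else 0.0
```

With retrieved = [4, 7, 9] and gold_set = [9], the sketch gives 0.0 at k = 2 and 1.0 at k = 3.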

roksana.evaluation.recall_at_k(retrieved: List[int], gold_set: List[int], k: int) → float[source]

Calculate Recall@k metric.

Parameters:

retrieved (List[int]) – List of retrieved node indices.
gold_set (List[int]) – List of gold node indices.
k (int) – The k in Recall@k.

Returns:

Recall@k value.

Return type:

float
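
The description does not spell out the formula. The sketch below assumes the standard definition of Recall@k: the fraction of gold nodes found among the first k retrieved nodes. The name recall_at_k_sketch and the choice to return 0.0 for an empty gold set are assumptions, not confirmed library behavior.

```python
from typing import List


def recall_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Fraction of gold nodes present in the top-k retrieved nodes."""
    gold = set(gold_set)
    if not gold:
        # Assumption: define recall as 0.0 when there is nothing to recall.
        return 0.0
    found = gold.intersection(retrieved[:k])
    return len(found) / len(gold)
```

With retrieved = [4, 7, 9, 1] and gold_set = [7, 1, 5], one of three gold nodes is found at k = 2 and two of three at k = 4.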

roksana.evaluation.save_results_to_csv(results: List[Dict[str, Any]], filepath: str) → None[source]

Save evaluation results to a CSV file.

Parameters:

results (List[Dict[str, Any]]) – List of evaluation result dictionaries.
filepath (str) – Path to the CSV file where results will be saved.
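
One way to write a list of result dictionaries to CSV is with the standard library's csv.DictWriter. The sketch below is an assumption about how such a helper might look, not the library's implementation; write_results_csv is a hypothetical name.

```python
import csv
from typing import Any, Dict, List


def write_results_csv(results: List[Dict[str, Any]], filepath: str) -> None:
    """Hypothetical helper: write one CSV row per result dictionary."""
    # Collect column names in first-seen order so rows with extra keys still fit.
    fieldnames: List[str] = []
    for row in results:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(filepath, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        # Keys missing from a row are written as empty cells.
        writer.writerows(results)
```

Taking the union of keys across all rows avoids the ValueError that DictWriter raises when a row contains a key not listed in fieldnames.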

roksana.evaluation.save_results_to_json(results: List[Dict[str, Any]], filepath: str) → None[source]

Save evaluation results to a JSON file.

Parameters:

results (List[Dict[str, Any]]) – List of evaluation result dictionaries.
filepath (str) – Path to the JSON file where results will be saved.

roksana.evaluation.save_results_to_pickle(results: List[Dict[str, Any]], filepath: str) → None[source]

Save evaluation results to a Pickle file.

Parameters:

results (List[Dict[str, Any]]) – List of evaluation result dictionaries.
filepath (str) – Path to the Pickle file where results will be saved.
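
The JSON and Pickle variants differ mainly in what they can store. The sketch below uses only the standard library; dump_json and dump_pickle are hypothetical names, and this is an illustration rather than the library's code.

```python
import json
import pickle
from typing import Any, Dict, List


def dump_json(results: List[Dict[str, Any]], filepath: str) -> None:
    """Hypothetical helper: write results as human-readable JSON text."""
    with open(filepath, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2)


def dump_pickle(results: List[Dict[str, Any]], filepath: str) -> None:
    """Hypothetical helper: write results as a binary pickle."""
    with open(filepath, "wb") as handle:
        pickle.dump(results, handle)
```

JSON is portable and readable but handles only basic types such as numbers, strings, lists, and dictionaries. Pickle preserves arbitrary Python objects, but a pickle file should only be loaded when it comes from a trusted source.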