Welcome to ROKSANA’s documentation!

API Documentation

class roksana.Evaluator(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])

Bases: object

Evaluator class to assess the impact of attack methods on search strategies.

__init__(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])

Initialize the Evaluator.

Parameters:
  • search_method_before – Instance of SearchMethod before attack.

  • search_method_after – Instance of SearchMethod after attack.

  • k_values (List[int], optional) – List of k values for Hit@k and Recall@k. Defaults to [5, 10, 20].

evaluate(queries: List[int], gold_sets: List[List[int]], results_dir: str = 'results', filename: str = 'evaluation_results.csv') → None

Perform evaluation on the given queries and save the results.

Parameters:
  • queries (List[int]) – List of query node indices.

  • gold_sets (List[List[int]]) – List of gold sets corresponding to each query.

  • results_dir (str, optional) – Directory to save the results file. Defaults to ‘results’.

  • filename (str, optional) – Name of the results file. Defaults to ‘evaluation_results.csv’.

get_all_results() → List[Dict[str, Any]]

Retrieve all evaluation results.

Returns:

List of evaluation result dictionaries.

Return type:

List[Dict[str, Any]]

class roksana.GATSearch(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)

Bases: SearchMethod

Search method using Graph Attention Networks (GAT).

__init__(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)

Initialize and train the GAT model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in GAT layers.

  • heads (int, optional) – Number of attention heads in GAT layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int]

Perform a search with the given query features using GAT embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()

Train the GAT model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.GCNSearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Bases: SearchMethod

Search method using Graph Convolutional Networks (GCN).

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Initialize and train the GCN model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in GCN layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[List[int]]

Perform a search with the given query features using GCN embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature tensor of the query nodes, shape [num_queries, feature_dim] or [feature_dim].

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of lists containing node indices sorted by similarity to each query.

Return type:

List[List[int]]
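The two accepted input shapes and the nested return value can be illustrated with a minimal, library-free sketch. It assumes that similarity means cosine similarity and that the query vectors already live in the same space as the node embeddings; neither detail is stated above, and `cosine` and `batch_top_k` are hypothetical names rather than part of ROKSANA.

```python
import math
from typing import List, Sequence, Union

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def batch_top_k(
    queries: Union[Vector, Sequence[Vector]],
    embeddings: Sequence[Vector],
    top_k: int = 10,
) -> List[List[int]]:
    """Return, for each query, the indices of the top_k most similar rows."""
    # Promote a single [feature_dim] vector to a batch containing one query.
    if queries and not isinstance(queries[0], (list, tuple)):
        queries = [queries]
    results = []
    for q in queries:
        scores = [cosine(q, e) for e in embeddings]
        # Sort node indices by descending similarity; ties keep index order.
        ranked = sorted(range(len(embeddings)), key=lambda i: -scores[i])
        results.append(ranked[:top_k])
    return results


embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
single = batch_top_k([1.0, 0.0], embeddings, top_k=2)               # one query
batch = batch_top_k([[1.0, 0.0], [0.0, 1.0]], embeddings, top_k=2)  # two queries
```

Even for a single query the result stays a list of lists, which matches the documented return type.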

train_model()

Train the GCN model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.SAGESearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Bases: SearchMethod

Search method using GraphSAGE.

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Initialize and train the GraphSAGE model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in SAGE layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int]

Perform a search with the given query features using GraphSAGE embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()

Train the GraphSAGE model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.SearchMethod(data: Any, device: str = None, **kwargs)

Bases: ABC

Abstract base class for search methods.

abstract __init__(data: Any, device: str = None, **kwargs)

Initialize the search method with the given dataset.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

abstract search(query_features: Any, top_k: int = 10) → List[int]

Perform a search with the given query features.

Parameters:
  • query_features (Any) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]
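To show what implementing this interface involves, here is a minimal sketch that mirrors the two abstract methods with a stand-in base class, so it runs without ROKSANA installed. `SearchMethodSketch` and `DotProductSearch` are hypothetical names, and ranking by dot product is an assumption chosen only to keep the example short.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class SearchMethodSketch(ABC):
    """Stand-in for the documented interface, not the roksana class itself."""

    @abstractmethod
    def __init__(self, data: Any, device: str = None, **kwargs):
        ...

    @abstractmethod
    def search(self, query_features: Any, top_k: int = 10) -> List[int]:
        ...


class DotProductSearch(SearchMethodSketch):
    """Hypothetical subclass: ranks rows of `data` by dot product with the query."""

    def __init__(self, data: Any, device: str = None, **kwargs):
        self.data = [list(row) for row in data]
        self.device = device or "cpu"

    def search(self, query_features: Any, top_k: int = 10) -> List[int]:
        scores = [sum(q * x for q, x in zip(query_features, row)) for row in self.data]
        # Highest score first; ties keep the original node order.
        ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
        return ranked[:top_k]


searcher = DotProductSearch([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
top = searcher.search([1.0, 0.0], top_k=2)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated.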

class roksana.UserDataset(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)

Bases: InMemoryDataset

A dataset class for user-provided datasets adhering to PyG’s InMemoryDataset structure.

Users should provide their data in a specific format, typically as a list of torch_geometric.data.Data objects.

__init__(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)

Initialize the UserDataset.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

  • data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. If provided, it will be used to initialize the dataset.

download()

Users are expected to provide their own data, so no download is necessary.

process()

Process the user-provided data and save it in the processed file.

Users can modify this method if they have specific processing requirements.

property processed_file_names: List[str]

The name of the processed file.

property raw_file_names: List[str]

Since users provide their own data, this can be left empty or used to list expected raw files.

roksana.demotion_value(before_attack_rank: int, after_attack_rank: int) → int

Calculate the Demotion Value metric.

Parameters:
  • before_attack_rank (int) – The rank of the target node before the attack.

  • after_attack_rank (int) – The rank of the target node after the attack.

Returns:

Difference in rank (after_attack_rank - before_attack_rank).

A positive value indicates demotion.

Return type:

int
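As a worked example of the sign convention, the following sketch re-implements the documented formula under the hypothetical name `demotion_value_sketch`. It assumes that rank 1 is the best position, so a larger rank number means a lower placement.

```python
def demotion_value_sketch(before_attack_rank: int, after_attack_rank: int) -> int:
    """Rank difference as documented: after_attack_rank - before_attack_rank."""
    return after_attack_rank - before_attack_rank


# Target node falls from rank 3 to rank 10: positive value, i.e. demotion.
demoted = demotion_value_sketch(before_attack_rank=3, after_attack_rank=10)

# Target node climbs from rank 8 to rank 2: negative value, i.e. promotion.
promoted = demotion_value_sketch(before_attack_rank=8, after_attack_rank=2)
```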

roksana.get_attack_method(name: str, data: Any, **kwargs) → BaseAttack

Retrieve an instance of the specified attack method.

Parameters:
  • name (str) – Name of the attack method (e.g., ‘degree’, ‘pagerank’, ‘random’, ‘viking’).

  • data (Any) – The graph dataset.

  • **kwargs – Additional keyword arguments for initializing the attack method.

Returns:

An instance of the requested attack method.

Return type:

BaseAttack

Raises:

ValueError – If the specified attack method is not registered.

Example

>>> from roksana.attack_methods.registry import get_attack_method
>>> attack = get_attack_method('degree', data=my_graph, param1=value1)

roksana.get_dataset_info(dataset: InMemoryDataset) → Dict[str, Any]

Retrieve basic information about a dataset.

Parameters:

dataset (InMemoryDataset) – The dataset instance.

Returns:

A dictionary containing dataset information.

Return type:

Dict[str, Any]

roksana.get_search_method(name: str, data: Any, device: str = None, **kwargs) → SearchMethod

Retrieve an instance of the specified search method.

Parameters:
  • name (str) – Name of the search method (e.g., ‘gcn’, ‘gat’, ‘sage’).

  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • **kwargs – Additional keyword arguments for the search method.

Returns:

An instance of the requested search method.

Return type:

SearchMethod

Raises:

ValueError – If the specified search method is not registered.
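The lookup-by-name behaviour, including the ValueError for unknown names, can be sketched with a plain dictionary registry. This is only an illustration of the pattern: `SEARCH_METHODS_SKETCH`, `register`, `ToySearch`, and `get_search_method_sketch` are hypothetical names, and ROKSANA's actual registry may be organised differently.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a method name to the class that implements it.
SEARCH_METHODS_SKETCH: Dict[str, Callable[..., Any]] = {}


def register(name: str):
    """Class decorator that records the class under the given name."""
    def decorator(cls):
        SEARCH_METHODS_SKETCH[name] = cls
        return cls
    return decorator


@register("toy")
class ToySearch:
    """Placeholder search method that only stores its constructor arguments."""

    def __init__(self, data: Any, device: str = None, **kwargs):
        self.data = data
        self.device = device
        self.params = kwargs


def get_search_method_sketch(name: str, data: Any, device: str = None, **kwargs) -> Any:
    """Instantiate the registered class, or raise ValueError for unknown names."""
    try:
        cls = SEARCH_METHODS_SKETCH[name]
    except KeyError:
        raise ValueError(f"Search method '{name}' is not registered.") from None
    # Extra keyword arguments are forwarded unchanged to the constructor.
    return cls(data, device=device, **kwargs)


method = get_search_method_sketch("toy", data=[1, 2], device="cpu", hidden_channels=32)
```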

roksana.hit_at_k(retrieved: List[int], gold_set: List[int], k: int) → float

Calculate Hit@k metric.

Parameters:
  • retrieved (List[int]) – List of retrieved node indices.

  • gold_set (List[int]) – List of gold node indices.

  • k (int) – The k in Hit@k.

Returns:

Hit@k value (1 if at least one gold node is in the top-k, else 0).

Return type:

float
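The definition above can be written out as a short sketch. `hit_at_k_sketch` is a hypothetical re-implementation for illustration, not the library function, and it assumes `retrieved` is already ordered from most to least similar.

```python
from typing import List


def hit_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Return 1.0 if any of the first k retrieved nodes is a gold node, else 0.0."""
    gold = set(gold_set)
    return 1.0 if any(node in gold for node in retrieved[:k]) else 0.0


retrieved = [4, 7, 1, 9]
gold_set = [9, 2]
miss = hit_at_k_sketch(retrieved, gold_set, k=3)  # node 9 sits at position 4
hit = hit_at_k_sketch(retrieved, gold_set, k=4)   # node 9 is now inside the cut-off
```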

roksana.list_available_standard_datasets() → List[str]

List all available standard datasets supported by ROKSANA.

Returns:

A list of supported dataset names.

Return type:

List[str]

roksana.load_dataset(dataset_name: str | None = None, root: str = 'data', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None) → InMemoryDataset

Load a dataset, either a standard dataset or a user-provided dataset.

Parameters:
  • dataset_name (str, optional) – Name of the standard dataset to load (e.g., ‘cora’, ‘citeseer’). If None, a UserDataset should be provided via data_list.

  • root (str, optional) – Root directory where the dataset should be saved or loaded from. Defaults to ‘data’.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

  • data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. Required if dataset_name is None.

Returns:

An instance of the loaded dataset.

Return type:

InMemoryDataset

roksana.load_standard_dataset(name: str, root: str = 'data') → Planetoid

Load a standard dataset from PyG’s built-in datasets.

Supported datasets: ‘cora’, ‘citeseer’, ‘pubmed’, etc. Refer to PyG’s Planetoid datasets for more.

Parameters:
  • name (str) – Name of the dataset to load (e.g., ‘Cora’, ‘Citeseer’).

  • root (str, optional) – Root directory where the dataset should be saved. Defaults to ‘data’.

Returns:

An instance of the Planetoid dataset.

Return type:

Planetoid

roksana.load_user_dataset_from_files(data_dir: str, file_format: str = 'json', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None) → UserDataset

Load a user dataset from files in a specified directory.

Supported file formats: ‘json’, ‘csv’, ‘pickle’.

Parameters:
  • data_dir (str) – Directory containing the dataset files.

  • file_format (str, optional) – Format of the dataset files. Defaults to ‘json’.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

Returns:

An instance of the UserDataset loaded from the files.

Return type:

UserDataset

roksana.prepare_search_set(data: Data, percentage: float = 0.1, seed: int = 42) → Tuple[List[int], List[List[int]]]

Prepare a search set for search evaluation by selecting a percentage of nodes as queries and creating corresponding gold sets based on feature similarity.

Parameters:
  • data (Data) – The graph dataset.

  • percentage (float, optional) – Fraction of nodes to select as queries, given as a value between 0 and 1. Defaults to 0.1 (10%).

  • seed (int, optional) – Seed for random number generator to ensure reproducibility. Defaults to 42.

Returns:

A tuple containing:
  • queries (List[int]): List of node indices selected as queries.

  • gold_sets (List[List[int]]): List of gold sets, where each gold set is a list of node indices with the same features as the corresponding query.

Return type:

Tuple[List[int], List[List[int]]]

Raises:
  • ValueError – If percentage is not between 0 and 1.

  • AttributeError – If dataset does not contain node features (data.x).
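The selection and gold-set construction described above can be sketched without PyG by treating node features as plain lists. Several details are assumptions made for illustration: `prepare_search_set_sketch` is a hypothetical name, "same features" is read as exact equality of feature vectors, the query node is left out of its own gold set, and at least one query is always selected.

```python
import random
from typing import Dict, List, Sequence, Tuple


def prepare_search_set_sketch(
    features: Sequence[Sequence[float]],
    percentage: float = 0.1,
    seed: int = 42,
) -> Tuple[List[int], List[List[int]]]:
    """Pick a seeded sample of query nodes and group nodes with identical features."""
    # ASSUMPTION: the valid range is 0 < percentage <= 1.
    if not 0 < percentage <= 1:
        raise ValueError("percentage must be between 0 and 1")
    num_nodes = len(features)
    num_queries = max(1, int(num_nodes * percentage))
    rng = random.Random(seed)  # local generator keeps the sample reproducible
    queries = sorted(rng.sample(range(num_nodes), num_queries))

    # Index nodes by their feature vector so equal rows land in the same group.
    groups: Dict[tuple, List[int]] = {}
    for idx, row in enumerate(features):
        groups.setdefault(tuple(row), []).append(idx)

    # ASSUMPTION: a query is not a member of its own gold set.
    gold_sets = [[n for n in groups[tuple(features[q])] if n != q] for q in queries]
    return queries, gold_sets


features = [[1, 0], [1, 0], [0, 1], [1, 0]]
queries, gold_sets = prepare_search_set_sketch(features, percentage=1.0)
```

Node 2 has a unique feature vector, so its gold set is empty under these assumptions.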

roksana.recall_at_k(retrieved: List[int], gold_set: List[int], k: int) → float

Calculate Recall@k metric.

Parameters:
  • retrieved (List[int]) – List of retrieved node indices.

  • gold_set (List[int]) – List of gold node indices.

  • k (int) – The k in Recall@k.

Returns:

Recall@k value.

Return type:

float
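The formula is not spelled out here, so the sketch below assumes the conventional definition: the share of gold nodes that appear among the first k retrieved nodes. `recall_at_k_sketch` is a hypothetical name, and returning 0.0 for an empty gold set is a further assumption.

```python
from typing import List


def recall_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Fraction of gold nodes found in the first k retrieved nodes."""
    gold = set(gold_set)
    if not gold:
        return 0.0  # ASSUMPTION: recall of an empty gold set is reported as 0.0
    found = gold.intersection(retrieved[:k])
    return len(found) / len(gold)


retrieved = [4, 7, 1, 9]
gold_set = [7, 9, 2]
recall_at_2 = recall_at_k_sketch(retrieved, gold_set, k=2)  # only node 7 found
recall_at_4 = recall_at_k_sketch(retrieved, gold_set, k=4)  # nodes 7 and 9 found
```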

class roksana.datasets.UserDataset(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)

Bases: InMemoryDataset

A dataset class for user-provided datasets adhering to PyG’s InMemoryDataset structure.

Users should provide their data in a specific format, typically as a list of torch_geometric.data.Data objects.

__init__(root: str, transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None)

Initialize the UserDataset.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

  • data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. If provided, it will be used to initialize the dataset.

download()

Users are expected to provide their own data, so no download is necessary.

process()

Process the user-provided data and save it in the processed file.

Users can modify this method if they have specific processing requirements.

property processed_file_names: List[str]

The name of the processed file.

property raw_file_names: List[str]

Since users provide their own data, this can be left empty or used to list expected raw files.

roksana.datasets.get_dataset_info(dataset: InMemoryDataset) → Dict[str, Any]

Retrieve basic information about a dataset.

Parameters:

dataset (InMemoryDataset) – The dataset instance.

Returns:

A dictionary containing dataset information.

Return type:

Dict[str, Any]

roksana.datasets.list_available_standard_datasets() → List[str]

List all available standard datasets supported by ROKSANA.

Returns:

A list of supported dataset names.

Return type:

List[str]

roksana.datasets.load_dataset(dataset_name: str | None = None, root: str = 'data', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None, data_list: List[Data] | None = None) → InMemoryDataset

Load a dataset, either a standard dataset or a user-provided dataset.

Parameters:
  • dataset_name (str, optional) – Name of the standard dataset to load (e.g., ‘cora’, ‘citeseer’). If None, a UserDataset should be provided via data_list.

  • root (str, optional) – Root directory where the dataset should be saved or loaded from. Defaults to ‘data’.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

  • data_list (List[Data], optional) – A list of torch_geometric.data.Data objects. Required if dataset_name is None.

Returns:

An instance of the loaded dataset.

Return type:

InMemoryDataset

roksana.datasets.load_standard_dataset(name: str, root: str = 'data') → Planetoid

Load a standard dataset from PyG’s built-in datasets.

Supported datasets: ‘cora’, ‘citeseer’, ‘pubmed’, etc. Refer to PyG’s Planetoid datasets for more.

Parameters:
  • name (str) – Name of the dataset to load (e.g., ‘Cora’, ‘Citeseer’).

  • root (str, optional) – Root directory where the dataset should be saved. Defaults to ‘data’.

Returns:

An instance of the Planetoid dataset.

Return type:

Planetoid

roksana.datasets.load_user_dataset_from_files(data_dir: str, file_format: str = 'json', transform: Callable | None = None, pre_transform: Callable | None = None, pre_filter: Callable | None = None) → UserDataset

Load a user dataset from files in a specified directory.

Supported file formats: ‘json’, ‘csv’, ‘pickle’.

Parameters:
  • data_dir (str) – Directory containing the dataset files.

  • file_format (str, optional) – Format of the dataset files. Defaults to ‘json’.

  • transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (Callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (Callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

Returns:

An instance of the UserDataset loaded from the files.

Return type:

UserDataset

roksana.datasets.prepare_search_set(data: Data, percentage: float = 0.1, seed: int = 42) → Tuple[List[int], List[List[int]]]

Prepare a search set for search evaluation by selecting a percentage of nodes as queries and creating corresponding gold sets based on feature similarity.

Parameters:
  • data (Data) – The graph dataset.

  • percentage (float, optional) – Fraction of nodes to select as queries, given as a value between 0 and 1. Defaults to 0.1 (10%).

  • seed (int, optional) – Seed for random number generator to ensure reproducibility. Defaults to 42.

Returns:

A tuple containing:
  • queries (List[int]): List of node indices selected as queries.

  • gold_sets (List[List[int]]): List of gold sets, where each gold set is a list of node indices with the same features as the corresponding query.

Return type:

Tuple[List[int], List[List[int]]]

Raises:
  • ValueError – If percentage is not between 0 and 1.

  • AttributeError – If dataset does not contain node features (data.x).

class roksana.search_methods.GATSearch(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)

Bases: SearchMethod

Search method using Graph Attention Networks (GAT).

__init__(data: Any, device: str = None, hidden_channels: int = 64, heads: int = 8, epochs: int = 200, lr: float = 0.005)

Initialize and train the GAT model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in GAT layers.

  • heads (int, optional) – Number of attention heads in GAT layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int]

Perform a search with the given query features using GAT embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()

Train the GAT model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.GCNSearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Bases: SearchMethod

Search method using Graph Convolutional Networks (GCN).

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Initialize and train the GCN model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in GCN layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[List[int]]

Perform a search with the given query features using GCN embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature tensor of the query nodes, shape [num_queries, feature_dim] or [feature_dim].

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of lists containing node indices sorted by similarity to each query.

Return type:

List[List[int]]

train_model()

Train the GCN model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.SAGESearch(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Bases: SearchMethod

Search method using GraphSAGE.

__init__(data: Any, device: str = None, hidden_channels: int = 64, epochs: int = 200, lr: float = 0.01)

Initialize and train the GraphSAGE model.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • hidden_channels (int, optional) – Number of hidden channels in SAGE layers.

  • epochs (int, optional) – Number of training epochs.

  • lr (float, optional) – Learning rate for the optimizer.

evaluate() → float

Evaluate the model’s accuracy on the training set.

Returns:

Training accuracy.

Return type:

float

get_node_embeddings() → Tensor

Generate node embeddings by passing the data through the model.

Returns:

Node embeddings.

Return type:

torch.Tensor

search(query_features: Tensor, top_k: int = 10) → List[int]

Perform a search with the given query features using GraphSAGE embeddings.

Parameters:
  • query_features (torch.Tensor) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

train_model()

Train the GraphSAGE model on the dataset. Assumes that the dataset has a ‘y’ attribute for node labels.

class roksana.search_methods.SearchMethod(data: Any, device: str = None, **kwargs)

Bases: ABC

Abstract base class for search methods.

abstract __init__(data: Any, device: str = None, **kwargs)

Initialize the search method with the given dataset.

Parameters:
  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

abstract search(query_features: Any, top_k: int = 10) → List[int]

Perform a search with the given query features.

Parameters:
  • query_features (Any) – Feature vector of the query node.

  • top_k (int, optional) – Number of top similar nodes to retrieve.

Returns:

List of node indices sorted by similarity to the query.

Return type:

List[int]

roksana.search_methods.get_search_method(name: str, data: Any, device: str = None, **kwargs) → SearchMethod

Retrieve an instance of the specified search method.

Parameters:
  • name (str) – Name of the search method (e.g., ‘gcn’, ‘gat’, ‘sage’).

  • data (Any) – The graph dataset.

  • device (str, optional) – Device to run the computations on (‘cpu’ or ‘cuda’).

  • **kwargs – Additional keyword arguments for the search method.

Returns:

An instance of the requested search method.

Return type:

SearchMethod

Raises:

ValueError – If the specified search method is not registered.

Attack Methods Package

This package provides various attack methods for adversarial modifications of graph datasets. It includes both predefined attack methods and utilities for registering and retrieving custom attacks.

Modules:
  • base_attack: Defines the abstract base class for all attack methods.

  • registry: Manages registration and retrieval of attack methods.

  • degree: Implements degree-based edge removal.

  • pagerank: Implements PageRank-based edge removal.

  • random: Implements random edge removal.

  • viking: Implements Viking perturbation attack.

- BaseAttack

Abstract base class for all attack methods.

- get_attack_method

Retrieve a registered attack method by name.

- ATTACK_METHODS

Dictionary of registered attack methods.

- DegreeAttack

Implements degree-based attack logic.

- PageRankAttack

Implements PageRank-based attack logic.

- RandomAttack

Implements random attack logic.

- VikingAttack

Implements Viking perturbation attack logic.

class roksana.attack_methods.BaseAttack(data: Any, **kwargs)

Bases: ABC

Abstract BaseAttack Class

Defines the interface for attack methods. All attack methods must inherit from this class and implement the __init__ and attack methods.

data

The graph dataset used by the attack method.

Type:

Any

params

Additional parameters for the attack method.

Type:

dict

abstract __init__(data: Any, **kwargs)

Initialize the attack method with the given dataset.

Parameters:
  • data (Any) – The graph dataset.

  • **kwargs – Additional keyword arguments specific to the attack method.

abstract attack(query_node: int, perturbations: int = 1) → Dict[str, Any]

Perform an attack on the specified query node.

Parameters:
  • query_node (int) – Index of the node to attack.

  • perturbations (int, optional) – Number of perturbations to apply. Defaults to 1.

Returns:

A dictionary containing details of the attack, such as removed edges or modifications made.

Return type:

Dict[str, Any]

class roksana.attack_methods.DegreeAttack(data: Any, **kwargs)

Bases: BaseAttack

DegreeAttack Class

Implements an adversarial attack that removes edges based on the degree of connected nodes.

data

The graph dataset.

Type:

Any

params

Additional parameters for the attack.

Type:

dict

__init__(data: Any, **kwargs)

Initialize the DegreeAttack method.

Parameters:
  • data (Any) – The graph dataset.

  • **kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]]

Perform the degree-based attack on the graph dataset.

Parameters:
  • data (Any) – The graph dataset.

  • selected_nodes (torch.Tensor) – Nodes to target for edge removal. Must be a 1D tensor.

Returns:

  • updated_data (Any): The modified graph dataset with updated edges.

  • removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]
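The selection rule is described only as being based on the degree of connected nodes, so the following sketch has to pick one concrete reading: for each selected node, it removes the edge leading to that node's highest-degree neighbour. That rule, the plain edge-list representation, and the name `degree_edge_removal` are all assumptions for illustration and may differ from what DegreeAttack actually does.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def degree_edge_removal(
    edges: Sequence[Edge], selected_nodes: Sequence[int]
) -> Tuple[List[Edge], List[Edge]]:
    """Remove, per selected node, the edge to its highest-degree neighbour."""
    remaining = list(edges)
    removed: List[Edge] = []
    for node in selected_nodes:
        incident = [e for e in remaining if node in e]
        if not incident:
            continue  # isolated node: nothing to remove
        # Degree of every node in the current (already perturbed) graph.
        degree = Counter(n for e in remaining for n in e)
        # ASSUMPTION: the victim is the edge whose other endpoint has the highest degree.
        victim = max(incident, key=lambda e: degree[e[1] if e[0] == node else e[0]])
        remaining.remove(victim)
        removed.append(victim)
    return remaining, removed


edges = [(0, 1), (0, 2), (2, 3), (2, 4)]
updated_edges, removed_edges = degree_edge_removal(edges, selected_nodes=[0])
```

Node 0 has neighbours 1 (degree 1) and 2 (degree 3), so the edge (0, 2) is the one removed in this example.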

class roksana.attack_methods.PageRankAttack(data: Any, **kwargs)

Bases: BaseAttack

PageRankAttack Class

Implements an adversarial attack that removes edges connected to nodes based on their PageRank scores.

data

The graph dataset.

Type:

Any

params

Additional parameters for the attack.

Type:

dict

__init__(data: Any, **kwargs)

Initialize the PageRankAttack method.

Parameters:
  • data (Any) – The graph dataset.

  • **kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) → Tuple[Any, List[Tuple[int, int]]]

Perform the PageRank-based attack on the graph dataset.

Parameters:
  • data (Any) – The graph dataset.

  • selected_nodes (torch.Tensor) – Nodes to target for edge removal. Must be a 1D tensor.

Returns:

  • updated_data (Any): The modified graph dataset with updated edges.

  • removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]
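How PageRank scores drive the edge choice is not specified above. As a rough illustration, the sketch below computes PageRank by power iteration on an undirected edge list and, for each selected node, removes the edge to its highest-scoring neighbor. The names pagerank and pagerank_attack_sketch, the damping factor, the iteration count, and the selection rule are assumptions, not the library's code.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def pagerank(edges: List[Edge], damping: float = 0.85, iters: int = 50) -> Dict[int, float]:
    """Plain power iteration; every node that appears in an edge has at least one neighbor."""
    neighbors: Dict[int, List[int]] = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    n = len(neighbors)
    rank = {node: 1.0 / n for node in neighbors}
    for _ in range(iters):
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[m] / len(neighbors[m]) for m in neighbors[node])
            for node in neighbors
        }
    return rank


def pagerank_attack_sketch(
    edges: List[Edge], selected_nodes: List[int]
) -> Tuple[List[Edge], List[Edge]]:
    """Hypothetical rule: per selected node, drop the edge to its top-PageRank neighbor."""
    remaining = list(edges)
    removed: List[Edge] = []
    for node in selected_nodes:
        incident = [e for e in remaining if node in e]
        if not incident:
            continue
        rank = pagerank(remaining)  # scores on the current, already perturbed graph
        target = max(incident, key=lambda e: rank[e[0] if e[1] == node else e[1]])
        remaining.remove(target)
        removed.append(target)
    return remaining, removed


edges = [(0, 1), (0, 2), (2, 3), (2, 4)]
updated, removed = pagerank_attack_sketch(edges, [0])
```

In this toy graph node 2 is the hub, so it outranks node 1 and the edge (0, 2) is the one removed.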

class roksana.attack_methods.RandomAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

RandomAttack Class

Implements an adversarial attack that randomly removes edges connected to specified nodes in a graph.

data

The graph dataset.

Type:

Any

params

Additional parameters for the attack.

Type:

dict

__init__(data: Any, **kwargs)[source]

Initialize the RandomAttack method.

Parameters:
  • data (Any) – The graph dataset.

  • **kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) Tuple[Any, List[Tuple[int, int]]][source]

Perform the random edge removal attack.

Parameters:
  • data (Any) – The graph dataset.

  • selected_nodes (torch.Tensor) – Nodes for which edges are to be removed. Can be a single node or a tensor of nodes.

Returns:

  • The modified graph dataset with updated edges.

  • A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]
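A minimal sketch of random edge removal, assuming one incident edge is dropped per selected node and that the graph is a plain list of undirected edges. The function name random_attack_sketch and the seed parameter are hypothetical. A single integer is accepted as well as a list, echoing the "single node or a tensor of nodes" note above.

```python
import random
from typing import List, Optional, Tuple, Union

Edge = Tuple[int, int]


def random_attack_sketch(
    edges: List[Edge],
    selected_nodes: Union[int, List[int]],
    seed: Optional[int] = None,
) -> Tuple[List[Edge], List[Edge]]:
    """Hypothetical rule: remove one randomly chosen incident edge per selected node."""
    if isinstance(selected_nodes, int):
        selected_nodes = [selected_nodes]  # normalize a single node to a list
    rng = random.Random(seed)  # local generator so the global random state is untouched
    remaining = list(edges)
    removed: List[Edge] = []
    for node in selected_nodes:
        incident = [e for e in remaining if node in e]
        if incident:
            target = rng.choice(incident)
            remaining.remove(target)
            removed.append(target)
    return remaining, removed


edges = [(0, 1), (0, 2), (2, 3)]
updated, removed = random_attack_sketch(edges, 0, seed=0)
```

Passing a seed makes the removal repeatable, which helps when comparing search results before and after the attack.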

class roksana.attack_methods.VikingAttack(data: Any, **kwargs)[source]

Bases: BaseAttack

VikingAttack Class

Implements an adversarial attack by perturbing edges involving specified nodes in a graph.

data

The graph dataset.

Type:

Any

params

Additional parameters for the attack.

Type:

dict

__init__(data: Any, **kwargs)[source]

Initialize the VikingAttack method.

Parameters:
  • data (Any) – The graph dataset.

  • **kwargs – Additional parameters for the attack.

attack(data: Any, selected_nodes: Tensor) Tuple[Any, List[Tuple[int, int]]][source]

Execute the Viking perturbation attack.

Parameters:
  • data (Any) – The graph dataset.

  • selected_nodes (torch.Tensor) – Nodes to target for edge removal.

Returns:

  • updated_data (Any): The modified graph dataset with updated edges.

  • removed_edges (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[Any, List[Tuple[int, int]]]

perturbation_attack(data: Any, selected_nodes: Tensor) Tuple[Tensor, List[Tuple[int, int]]][source]

Perform the Viking perturbation attack by removing edges involving selected nodes.

Parameters:
  • data (Any) – The graph dataset.

  • selected_nodes (torch.Tensor) – Nodes to target for edge removal.

Returns:

  • retained_edges (torch.Tensor): The edge index after removal of edges.

  • edges_to_remove (List[Tuple[int, int]]): A list of removed edges.

Return type:

Tuple[torch.Tensor, List[Tuple[int, int]]]
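The split into retained and removed edges can be pictured with a small sketch. It assumes that every edge touching a selected node is removed, and it uses a plain Python edge list rather than a torch.Tensor edge index. The function name split_edges_sketch is hypothetical.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def split_edges_sketch(
    edges: List[Edge], selected_nodes: Iterable[int]
) -> Tuple[List[Edge], List[Edge]]:
    """Partition edges into (retained, removed); removed edges touch a selected node."""
    selected = set(selected_nodes)
    retained: List[Edge] = []
    removed: List[Edge] = []
    for u, v in edges:
        if u in selected or v in selected:
            removed.append((u, v))
        else:
            retained.append((u, v))
    return retained, removed


retained_edges, edges_to_remove = split_edges_sketch([(0, 1), (1, 2), (2, 3)], [1])
```

The two outputs correspond to the documented retained_edges and edges_to_remove, in that order.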

roksana.attack_methods.get_attack_method(name: str, data: Any, **kwargs) BaseAttack[source]

Retrieve an instance of the specified attack method.

Parameters:
  • name (str) – Name of the attack method (e.g., ‘degree’, ‘pagerank’, ‘random’, ‘viking’).

  • data (Any) – The graph dataset.

  • **kwargs – Additional keyword arguments for initializing the attack method.

Returns:

An instance of the requested attack method.

Return type:

BaseAttack

Raises:

ValueError – If the specified attack method is not registered.

Example

>>> from roksana.attack_methods.registry import get_attack_method
>>> attack = get_attack_method('degree', data=my_graph, param1=value1)
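A name-to-class lookup of this kind is commonly built on a small registry. The sketch below shows the general pattern under that assumption. The names _REGISTRY, register, DemoDegreeAttack, and get_attack_sketch are hypothetical and stand in for whatever ROKSANA uses internally.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping attack names to their classes.
_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records an attack class under `name`."""
    def decorator(cls: type) -> type:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("degree")
class DemoDegreeAttack:
    """Stand-in attack class that keeps the documented data/params attributes."""

    def __init__(self, data: Any, **kwargs: Any) -> None:
        self.data = data
        self.params = kwargs


def get_attack_sketch(name: str, data: Any, **kwargs: Any) -> Any:
    """Instantiate the class registered under `name`, or raise ValueError."""
    if name not in _REGISTRY:
        raise ValueError(
            f"Attack method '{name}' is not registered; available: {sorted(_REGISTRY)}"
        )
    return _REGISTRY[name](data, **kwargs)


attack = get_attack_sketch("degree", data={"nodes": 3}, budget=2)
```

Raising ValueError for an unknown name matches the documented behavior. Listing the registered names in the message is an optional convenience.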

class roksana.evaluation.Evaluator(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Bases: object

Evaluator class to assess the impact of attack methods on search strategies.

__init__(search_method_before, search_method_after, k_values: List[int] = [5, 10, 20])[source]

Initialize the Evaluator.

Parameters:
  • search_method_before – Instance of SearchMethod before attack.

  • search_method_after – Instance of SearchMethod after attack.

  • k_values (List[int], optional) – List of k values for Hit@k and Recall@k. Defaults to [5, 10, 20].

evaluate(queries: List[int], gold_sets: List[List[int]], results_dir: str = 'results', filename: str = 'evaluation_results.csv') None[source]

Perform evaluation on the given queries and save the results.

Parameters:
  • queries (List[int]) – List of query node indices.

  • gold_sets (List[List[int]]) – List of gold sets corresponding to each query.

  • results_dir (str, optional) – Directory to save the results file. Defaults to ‘results’.

  • filename (str, optional) – Name of the results file. Defaults to ‘evaluation_results.csv’.
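To make the pairing of queries and gold_sets concrete, here is a rough sketch of a before/after evaluation loop. It assumes each search method can be called as search(query, k) and returns a ranked list of node indices. That interface, the function name evaluate_sketch, and the result keys are hypothetical. Only Hit@k is computed to keep the example short.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical search interface: (query, k) -> ranked list of node indices.
Search = Callable[[int, int], List[int]]


def evaluate_sketch(
    before: Search,
    after: Search,
    queries: List[int],
    gold_sets: List[List[int]],
    k_values: Sequence[int] = (5, 10, 20),
) -> List[Dict[str, Any]]:
    """Build one result row per query with Hit@k before and after the attack."""
    if len(queries) != len(gold_sets):
        raise ValueError("queries and gold_sets must have the same length")
    rows: List[Dict[str, Any]] = []
    for query, gold in zip(queries, gold_sets):  # gold_sets[i] belongs to queries[i]
        gold_set = set(gold)
        row: Dict[str, Any] = {"query": query}
        for k in k_values:
            row[f"hit@{k}_before"] = float(any(n in gold_set for n in before(query, k)[:k]))
            row[f"hit@{k}_after"] = float(any(n in gold_set for n in after(query, k)[:k]))
        rows.append(row)
    return rows


rows = evaluate_sketch(
    before=lambda q, k: [1, 2, 3][:k],
    after=lambda q, k: [3, 2, 1][:k],
    queries=[0],
    gold_sets=[[1]],
    k_values=(1,),
)
```

Rows of this shape are the kind of per-query dictionaries that could then be written to the results file.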

get_all_results() List[Dict[str, Any]][source]

Retrieve all evaluation results.

Returns:

List of evaluation result dictionaries.

Return type:

List[Dict[str, Any]]

roksana.evaluation.demotion_value(before_attack_rank: int, after_attack_rank: int) int[source]

Calculate the Demotion Value metric.

Parameters:
  • before_attack_rank (int) – The rank of the target node before the attack.

  • after_attack_rank (int) – The rank of the target node after the attack.

Returns:

Difference in rank (after_attack_rank - before_attack_rank).

A positive value indicates demotion.

Return type:

int
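The formula stated above is simple enough to restate directly. The function name demotion_value_sketch is used here to avoid suggesting that this is the library's source.

```python
def demotion_value_sketch(before_attack_rank: int, after_attack_rank: int) -> int:
    """Rank difference as documented: after_attack_rank - before_attack_rank."""
    return after_attack_rank - before_attack_rank


demoted = demotion_value_sketch(before_attack_rank=3, after_attack_rank=10)
promoted = demotion_value_sketch(before_attack_rank=10, after_attack_rank=3)
```

Here demoted is 7 (the node fell from rank 3 to rank 10) and promoted is -7 (the node moved up).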

roksana.evaluation.hit_at_k(retrieved: List[int], gold_set: List[int], k: int) float[source]

Calculate Hit@k metric.

Parameters:
  • retrieved (List[int]) – List of retrieved node indices.

  • gold_set (List[int]) – List of gold node indices.

  • k (int) – The k in Hit@k.

Returns:

Hit@k value (1 if at least one gold node is in the top-k, else 0).

Return type:

float
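A minimal sketch of the metric as described, assuming retrieved is ordered from best to worst match. The name hit_at_k_sketch is hypothetical.

```python
from typing import List


def hit_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Return 1.0 if any gold node appears in the first k retrieved nodes, else 0.0."""
    gold = set(gold_set)  # set membership keeps the scan linear in k
    return 1.0 if any(node in gold for node in retrieved[:k]) else 0.0


miss = hit_at_k_sketch([4, 7, 2], gold_set=[2], k=2)
hit = hit_at_k_sketch([4, 7, 2], gold_set=[2], k=3)
```

The gold node 2 sits at position 3, so it is missed at k=2 and found at k=3.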

roksana.evaluation.recall_at_k(retrieved: List[int], gold_set: List[int], k: int) float[source]

Calculate Recall@k metric.

Parameters:
  • retrieved (List[int]) – List of retrieved node indices.

  • gold_set (List[int]) – List of gold node indices.

  • k (int) – The k in Recall@k.

Returns:

Recall@k value (conventionally, the fraction of gold nodes that appear among the top-k retrieved nodes).

Return type:

float
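The sketch below follows the conventional definition of Recall@k: the number of gold nodes found in the top-k results divided by the size of the gold set. Returning 0.0 for an empty gold set is an assumption, and the name recall_at_k_sketch is hypothetical.

```python
from typing import List


def recall_at_k_sketch(retrieved: List[int], gold_set: List[int], k: int) -> float:
    """Fraction of gold nodes that appear in the first k retrieved nodes."""
    gold = set(gold_set)
    if not gold:
        return 0.0  # assumed convention; avoids dividing by zero
    return len(gold & set(retrieved[:k])) / len(gold)


recall_3 = recall_at_k_sketch([4, 7, 2, 9], gold_set=[2, 9, 5], k=3)
recall_4 = recall_at_k_sketch([4, 7, 2, 9], gold_set=[2, 9, 5], k=4)
```

With three gold nodes, the top 3 results contain one of them and the top 4 contain two, giving 1/3 and 2/3.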

roksana.evaluation.save_results_to_csv(results: List[Dict[str, Any]], filepath: str) None[source]

Save evaluation results to a CSV file.

Parameters:
  • results (List[Dict[str, Any]]) – List of evaluation result dictionaries.

  • filepath (str) – Path to the CSV file where results will be saved.
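One way to write a list of result dictionaries to CSV with the standard library is sketched below. It assumes rows may carry different keys and takes the header as the union of all keys in first-seen order. The name save_csv_sketch is hypothetical.

```python
import csv
from typing import Any, Dict, List


def save_csv_sketch(results: List[Dict[str, Any]], filepath: str) -> None:
    """Write result dictionaries as CSV rows; missing keys become empty cells."""
    # dict.fromkeys de-duplicates keys while preserving first-seen order.
    fieldnames = list(dict.fromkeys(key for row in results for key in row))
    with open(filepath, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)
```

Opening the file with newline="" follows the csv module's guidance and avoids blank lines between rows on Windows.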

roksana.evaluation.save_results_to_json(results: List[Dict[str, Any]], filepath: str) None[source]

Save evaluation results to a JSON file.

Parameters:
  • results (List[Dict[str, Any]]) – List of evaluation result dictionaries.

  • filepath (str) – Path to the JSON file where results will be saved.
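A comparable sketch for JSON output, assuming the result values are already plain Python types (int, float, str, list, dict). The name save_json_sketch is hypothetical.

```python
import json
from typing import Any, Dict, List


def save_json_sketch(results: List[Dict[str, Any]], filepath: str) -> None:
    """Serialize result dictionaries to a human-readable JSON file."""
    with open(filepath, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2)
```

Values such as tensors or NumPy scalars are not JSON serializable and would need converting to plain Python numbers first.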

roksana.evaluation.save_results_to_pickle(results: List[Dict[str, Any]], filepath: str) None[source]

Save evaluation results to a Pickle file.

Parameters:
  • results (List[Dict[str, Any]]) – List of evaluation result dictionaries.

  • filepath (str) – Path to the Pickle file where results will be saved.
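For completeness, a pickle round trip can be sketched as follows. The names save_pickle_sketch and load_pickle_sketch are hypothetical; the loader is included only to show how such a file would be read back.

```python
import pickle
from typing import Any, Dict, List


def save_pickle_sketch(results: List[Dict[str, Any]], filepath: str) -> None:
    """Write results in binary pickle format."""
    with open(filepath, "wb") as handle:
        pickle.dump(results, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle_sketch(filepath: str) -> List[Dict[str, Any]]:
    """Read results back; only unpickle files from a trusted source."""
    with open(filepath, "rb") as handle:
        return pickle.load(handle)
```

Unlike CSV or JSON, pickle preserves arbitrary Python objects, but loading a pickle file can execute code, so it should be reserved for files you created yourself.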