diff --git a/index.html b/index.html index 0825c6a3..9d50c058 100644 --- a/index.html +++ b/index.html @@ -78,7 +78,7 @@
+ + + +
+ federatedlearning.inf.um.es +
NEBULA is a cutting-edge platform designed to facilitate the -training of federated models within both centralized and decentralized -architectures. It streamlines the development, deployment, and -management of federated applications across physical and virtualized -devices.
+ +
NEBULA (previously known as Fedstellar¹) is a cutting-edge platform designed to facilitate the training of federated models within both centralized and decentralized architectures. It streamlines the development, deployment, and management of federated applications across physical and virtualized devices.
NEBULA is developed by Enrique Tomás Martínez Beltrán in collaboration with the University of Murcia, armasuisse, and the University of Zurich.
+ + + + + + + + + +
NEBULA boasts a modular architecture that consists of three core -elements:
NEBULA boasts a modular architecture that consists of three core elements:
NEBULA is developed by Enrique Tomás Martínez -Beltrán in collaboration with the -University of Murcia, -Armasuisse, and the University of -Zurich (UZH).
For any questions, please contact Enrique Tomás Martínez Beltrán -(enriquetomas@um.es).
- - - +
To start using NEBULA, follow our detailed Installation Guide and User Manual. For any queries or contributions, check out our Contribution Guide.
We welcome contributions from the community to enhance NEBULA. If you are interested in contributing, please follow these steps:
git checkout -b feature/your-feature
git commit -am 'Add new feature'
git push origin feature/your-feature
If you use NEBULA (or Fedstellar) in a scientific publication, we would appreciate it if you cited the following works:
@article{MartinezBeltran:DFL:2023, + title = {{Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges}}, + author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and Quiles P{\'e}rez, Mario and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto}, + year = 2023, + volume = {25}, + number = {4}, + pages = {2983--3013}, + journal = {IEEE Communications Surveys \& Tutorials}, + doi = {10.1109/COMST.2023.3315746}, + preprint = {https://arxiv.org/abs/2211.08413} +} +
@article{MartinezBeltran:fedstellar:2024, + title = {{Fedstellar: A Platform for Decentralized Federated Learning}}, + author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and Perales G{\'o}mez, {\'A}ngel Luis and Feng, Chao and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto}, + year = 2024, + volume = {242}, + issn = {0957-4174}, + pages = {122861}, + journal = {Expert Systems with Applications}, + doi = {10.1016/j.eswa.2023.122861}, + preprint = {https://arxiv.org/abs/2306.09750} +} +
@inproceedings{MartinezBeltran:fedstellar_demo:2023, + title = {{Fedstellar: A Platform for Training Models in a Privacy-preserving and Decentralized Fashion}}, + author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto}, + year = 2023, + month = aug, + booktitle = {Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, {IJCAI-23}}, + publisher = {International Joint Conferences on Artificial Intelligence Organization}, + pages = {7154--7157}, + doi = {10.24963/ijcai.2023/838}, + note = {Demo Track}, + editor = {Edith Elkind} +} +
@article{MartinezBeltran:DFL_mitigating_threats:2023, + title = {{Mitigating Communications Threats in Decentralized Federated Learning through Moving Target Defense}}, + author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto}, + year = 2024, + journal = {Wireless Networks}, + doi = {10.1007/s11276-024-03667-8}, + preprint = {https://arxiv.org/abs/2307.11730} +} +
Distributed under the GNU GPLv3 License. See LICENSE for more information.
We would like to thank the following projects for their contributions which have helped shape NEBULA:
Fedstellar was our first version of the platform. We have redesigned the previous functionalities and added new capabilities based on our research. The platform is now called NEBULA and is available as an open-source project.
NEBULA is a cutting-edge platform designed to facilitate the training of federated models within both centralized and decentralized architectures. It streamlines the development, deployment, and management of federated applications across physical and virtualized devices.
NEBULA is developed by Enrique Tomás Martínez Beltrán in collaboration with the University of Murcia, Armasuisse, and the University of Zurich (UZH).
For any questions, please contact Enrique Tomás Martínez Beltrán (enriquetomas@um.es).
All notable changes to this project will be documented in this file.
We welcome contributions to this project. Please read the following guidelines.
NEBULA is a modular, adaptable, and extensible platform for creating centralized and decentralized architectures using Federated Learning. It also provides a standard approach for developing, deploying, and managing federated learning applications.
The platform enables developers to create distributed applications that use federated learning algorithms to improve user experience, security, and privacy. It provides features for managing data, models, and federated learning processes, along with a set of tools to help developers monitor and analyze the performance of their applications.
Virtualenv is a tool to build isolated Python environments.
It's a great way to quickly test new libraries without cluttering your global site-packages, or to run multiple projects on the same machine that depend on different versions of the same library.
Since Python 3.3, the standard library also includes a module called venv with roughly the same functionality.
To create a virtual environment called, e.g., nebula-venv using venv, run:
$ python3 -m venv nebula-venv
Once the environment is created, you need to activate it. Change directory into it and source the activation script, which is bin/activate on Linux and macOS or Scripts/activate on Windows.
With bash:
$ cd nebula-venv
$ . bin/activate
(nebula-venv) $
With csh/tcsh:
$ cd nebula-venv
$ source bin/activate.csh
(nebula-venv) $
Notice that the prompt changes once you activate the environment. To deactivate it, just type deactivate:
(nebula-venv) $ deactivate
$
After you have created the environment, you can install NEBULA by following the instructions below.
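The same environment can also be created programmatically with the standard-library venv module. The following is only a minimal sketch: the helper name create_env is hypothetical, and with_pip=False is an assumption made to keep the example fast (the command-line invocation installs pip by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment and return the path of its activate script."""
    # with_pip=False is an assumption to keep this sketch quick to run.
    venv.create(env_dir, with_pip=False)
    # venv places its scripts in "Scripts" on Windows and in "bin" elsewhere.
    scripts_dir = "Scripts" if sys.platform == "win32" else "bin"
    return env_dir / scripts_dir / "activate"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        activate = create_env(Path(tmp) / "nebula-venv")
        print(f"Activate with: . {activate}")
```

Activation itself still has to happen in the shell, because it modifies the environment of the calling process.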
Obtaining the platform
You can obtain the source code from https://github.com/CyberDataLab/nebula
Or, if you have git installed, you can clone the repository:
git clone https://github.com/CyberDataLab/nebula.git
Now, you can move to the source directory:
cd nebula
NEBULA requires some additional packages to install and work properly.
You can install them using pip:
pip3 install -r requirements.txt
Once the installation is finished, you can verify it by printing the NEBULA version with the following command:
python app/main.py --version
There are two ways to deploy the node in the federation: using Docker containers or isolated processes. You can choose the one that best fits your needs in the frontend.
You need to build the Docker image using the following command in the root directory:
docker build -t nebula-core .
To use a GPU inside Docker, follow the instructions at the following link to install nvidia-container-toolkit:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
You can list the Docker images using the following command:
docker images
You need to install the requirements of the node (core) using the following command in the root directory:
pip3 install -r nebula/requirements.txt
To run NEBULA, you can use the following command:
python app/main.py [PARAMS]
The first time you run the platform, the nebula-frontend Docker image will be built. This process can take a few minutes.
You can show the PARAMS using:
python app/main.py --help
The frontend will be available at http://127.0.0.1:6060 (by default).
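To check whether the frontend is actually listening on that address, a plain TCP probe is usually enough. This is a minimal sketch using only the standard library; the helper name is_port_open is hypothetical and is not part of NEBULA.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 6060 is the default frontend port mentioned above.
    print("frontend reachable:", is_port_open("127.0.0.1", 6060))
```

A successful probe only shows that something accepts connections on the port, not that the frontend finished starting.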
To change the default port of the frontend, you can use the following command:
python app/main.py --webport [PORT]
To change the default port of the statistics endpoint, you can use the following command:
python app/main.py --statsport [PORT]
You can log in with the following credentials:
- User: admin
- Password: admin
If the default credentials do not work, send an email to Enrique Tomás Martínez Beltrán to get the credentials.
To stop NEBULA, you can use the following command:
python app/main.py --stop
Be careful: this command stops all the containers related to NEBULA (frontend, controller, and nodes).
If the frontend is not working, check the logs in app/logs/server.log
If any of the following errors appear, take a look at the docker logs of the nebula-frontend container:
docker logs nebula-frontend
Network nebula_X Error failed to create network nebula_X: Error response from daemon: Pool overlaps with other one on this address space
Solution: Delete the docker network nebula_X
docker network rm nebula_X
Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Solution: Start the docker daemon
sudo dockerd
Solution: Enable the following option in Docker Desktop
Settings -> Advanced -> Allow the default Docker socket to be used
Error: Cannot connect to the Docker daemon at tcp://X.X.X.X:2375. Is the docker daemon running?
sudo dockerd -H tcp://X.X.X.X:2375
If the frontend is not working, restart the Docker daemon
sudo systemctl restart docker
Error: Too many open files
Solution: Increase the number of open files
ulimit -n 65536
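The limit that ulimit -n changes can also be inspected from Python on Unix-like systems through the standard-library resource module. The sketch below is illustrative only; the helper names are hypothetical, and the change applies to the current process and its children, not system-wide.

```python
import resource


def open_file_limits() -> tuple:
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target: int) -> int:
    """Raise the soft limit toward target, never above the hard limit."""
    soft, hard = open_file_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only ever raise the limit; lowering it is out of scope for this sketch.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return open_file_limits()[0]


if __name__ == "__main__":
    print("current (soft, hard):", open_file_limits())
```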
Also, you can add the following lines to the file /etc/security/limits.conf
NebulaEventHandler
Bases: PatternMatchingEventHandler
NebulaEventHandler handles file system events for .sh and .ps1 scripts.
This class reacts to the creation and deletion of scripts in a specified directory.
nebula/controller.py
import logging
import os
import subprocess
import threading
import time

import psutil
from watchdog.events import PatternMatchingEventHandler  # assumed origin of the base class


class NebulaEventHandler(PatternMatchingEventHandler):
    """
    NebulaEventHandler handles file system events for .sh and .ps1 scripts.

    This class reacts to the creation and deletion of scripts
    in a specified directory.
    """

    patterns = ["*.sh", "*.ps1"]

    def __init__(self):
        super().__init__()
        self.last_processed = {}
        # Events for the same path within 5 seconds are ignored.
        self.timeout_ns = 5 * 10**9
        self.processing_files = set()
        self.lock = threading.Lock()

    def _should_process_event(self, src_path: str) -> bool:
        current_time_ns = time.time_ns()
        logging.info(f"Current time (ns): {current_time_ns}")
        with self.lock:
            if src_path in self.last_processed:
                logging.info(f"Last processed time for {src_path}: {self.last_processed[src_path]}")
                last_time = self.last_processed[src_path]
                if current_time_ns - last_time < self.timeout_ns:
                    return False
            self.last_processed[src_path] = current_time_ns
        return True

    def _is_being_processed(self, src_path: str) -> bool:
        with self.lock:
            if src_path in self.processing_files:
                logging.info(f"Skipping {src_path} as it is already being processed.")
                return True
            self.processing_files.add(src_path)
        return False

    def _processing_done(self, src_path: str):
        with self.lock:
            if src_path in self.processing_files:
                self.processing_files.remove(src_path)

    def on_created(self, event):
        """
        Handles the event when a file is created.
        """
        if event.is_directory:
            return
        src_path = event.src_path
        if not self._should_process_event(src_path):
            return
        if self._is_being_processed(src_path):
            return
        logging.info(f"File created: {src_path}")
        try:
            self.run_script(src_path)
        finally:
            self._processing_done(src_path)

    def on_deleted(self, event):
        """
        Handles the event when a file is deleted.
        """
        if event.is_directory:
            return
        src_path = event.src_path
        if not self._should_process_event(src_path):
            return
        if self._is_being_processed(src_path):
            return
        logging.info(f"File deleted: {src_path}")
        directory_script = os.path.dirname(src_path)
        pids_file = os.path.join(directory_script, "current_scenario_pids.txt")
        logging.info(f"Killing processes from {pids_file}")
        try:
            self.kill_script_processes(pids_file)
            os.remove(pids_file)
        except FileNotFoundError:
            logging.warning(f"{pids_file} not found.")
        except Exception as e:
            logging.exception(f"Error while killing processes: {e}")
        finally:
            self._processing_done(src_path)

    def run_script(self, script):
        try:
            logging.info(f"Running script: {script}")
            if script.endswith(".sh"):
                # Blocks until the shell script finishes.
                result = subprocess.run(["bash", script], capture_output=True, text=True)
                logging.info(f"Script output:\n{result.stdout}")
                if result.stderr:
                    logging.error(f"Script error:\n{result.stderr}")
            elif script.endswith(".ps1"):
                # Started in the background; its output is not read here.
                subprocess.Popen(
                    ["powershell", "-ExecutionPolicy", "Bypass", "-File", script],
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE,
                    text=False,
                )
            else:
                logging.error("Unsupported script format.")
                return
        except Exception as e:
            logging.exception(f"Error while running script: {e}")

    def kill_script_processes(self, pids_file):
        try:
            with open(pids_file) as f:
                pids = f.readlines()
                for pid in pids:
                    try:
                        pid = int(pid.strip())
                        if psutil.pid_exists(pid):
                            process = psutil.Process(pid)
                            children = process.children(recursive=True)
                            logging.info(f"Forcibly killing process {pid} and {len(children)} child processes...")
                            # Kill the children first, then the main process.
                            for child in children:
                                try:
                                    logging.info(f"Forcibly killing child process {child.pid}")
                                    child.kill()
                                except psutil.NoSuchProcess:
                                    logging.warning(f"Child process {child.pid} already terminated.")
                                except Exception as e:
                                    logging.exception(f"Error while forcibly killing child process {child.pid}: {e}")
                            try:
                                logging.info(f"Forcibly killing main process {pid}")
                                process.kill()
                            except psutil.NoSuchProcess:
                                logging.warning(f"Process {pid} already terminated.")
                            except Exception as e:
                                logging.exception(f"Error while forcibly killing main process {pid}: {e}")
                        else:
                            logging.warning(f"PID {pid} does not exist.")
                    except ValueError:
                        logging.exception(f"Invalid PID value in file: {pid}")
                    except Exception as e:
                        logging.exception(f"Error while forcibly killing process {pid}: {e}")
        except FileNotFoundError:
            logging.exception(f"PID file not found: {pids_file}")
        except Exception as e:
            logging.exception(f"Error while reading PIDs from file: {e}")
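The per-path timeout check used by the handler can be isolated and exercised without a file system watcher. The sketch below is a simplified illustration, not the handler itself: EventDebouncer and its injectable clock parameter are hypothetical, and it defaults to time.monotonic_ns, whereas the source uses time.time_ns.

```python
import threading
import time


class EventDebouncer:
    """Hypothetical helper mirroring the per-path timeout check shown above."""

    def __init__(self, timeout_s: float = 5.0, clock=time.monotonic_ns):
        self.timeout_ns = int(timeout_s * 1e9)
        # The clock is injectable so the behaviour can be tested without sleeping.
        self.clock = clock
        self.last_processed = {}
        self.lock = threading.Lock()

    def should_process(self, path: str) -> bool:
        now = self.clock()
        with self.lock:
            last = self.last_processed.get(path)
            if last is not None and now - last < self.timeout_ns:
                # Same path seen too recently: drop the event.
                return False
            self.last_processed[path] = now
            return True


if __name__ == "__main__":
    debouncer = EventDebouncer(timeout_s=5.0)
    print(debouncer.should_process("scenario.sh"))  # first event for this path
    print(debouncer.should_process("scenario.sh"))  # repeated immediately afterwards
```

Note that a rejected event does not refresh the stored timestamp, so the window is measured from the last accepted event.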
on_created(event)
Handles the event when a file is created.
def on_created(self, event):
    """
    Handles the event when a file is created.
    """
    if event.is_directory:
        return
    src_path = event.src_path
    if not self._should_process_event(src_path):
        return
    if self._is_being_processed(src_path):
        return
    logging.info("File created: %s" % src_path)
    try:
        self.run_script(src_path)
    finally:
        self._processing_done(src_path)
on_deleted(event)
Handles the event when a file is deleted.
def on_deleted(self, event):
    """
    Handles the event when a file is deleted.
    """
    if event.is_directory:
        return
    src_path = event.src_path
    if not self._should_process_event(src_path):
        return
    if self._is_being_processed(src_path):
        return
    logging.info("File deleted: %s" % src_path)
    directory_script = os.path.dirname(src_path)
    pids_file = os.path.join(directory_script, "current_scenario_pids.txt")
    logging.info(f"Killing processes from {pids_file}")
    try:
        self.kill_script_processes(pids_file)
        os.remove(pids_file)
    except FileNotFoundError:
        logging.warning(f"{pids_file} not found.")
    except Exception as e:
        logging.exception(f"Error while killing processes: {e}")
    finally:
        self._processing_done(src_path)
Scenario
nebula/scenarios.py
class Scenario:
    def __init__(
        self,
        scenario_title,
        scenario_description,
        deployment,
        federation,
        topology,
        nodes,
        nodes_graph,
        n_nodes,
        matrix,
        dataset,
        iid,
        partition_selection,
        partition_parameter,
        model,
        agg_algorithm,
        rounds,
        logginglevel,
        accelerator,
        network_subnet,
        network_gateway,
        epochs,
        attacks,
        poisoned_node_percent,
        poisoned_sample_percent,
        poisoned_noise_percent,
        with_reputation,
        is_dynamic_topology,
        is_dynamic_aggregation,
        target_aggregation,
        random_geo,
        latitude,
        longitude,
        mobility,
        mobility_type,
        radius_federation,
        scheme_mobility,
        round_frequency,
        mobile_participants_percent,
        additional_participants,
        schema_additional_participants,
    ):
        """
        Initialize the scenario.

        Args:
            scenario_title (str): Title of the scenario.
            scenario_description (str): Description of the scenario.
            deployment (str): Type of deployment (e.g., 'docker', 'process').
            federation (str): Type of federation.
            topology (str): Network topology.
            nodes (dict): Dictionary of nodes.
            nodes_graph (dict): Graph of nodes.
            n_nodes (int): Number of nodes.
            matrix (list): Matrix of connections.
            dataset (str): Dataset used.
            iid (bool): Indicator if data is independent and identically distributed.
            partition_selection (str): Method of partition selection.
            partition_parameter (float): Parameter for partition selection.
            model (str): Model used.
            agg_algorithm (str): Aggregation algorithm.
            rounds (int): Number of rounds.
            logginglevel (str): Logging level.
            accelerator (str): Accelerator used.
            network_subnet (str): Network subnet.
            network_gateway (str): Network gateway.
            epochs (int): Number of epochs.
            attacks (list): List of attacks.
            poisoned_node_percent (float): Percentage of poisoned nodes.
            poisoned_sample_percent (float): Percentage of poisoned samples.
            poisoned_noise_percent (float): Percentage of poisoned noise.
            with_reputation (bool): Indicator if reputation is used.
            is_dynamic_topology (bool): Indicator if topology is dynamic.
            is_dynamic_aggregation (bool): Indicator if aggregation is dynamic.
            target_aggregation (str): Target aggregation method.
            random_geo (bool): Indicator if random geo is used.
            latitude (float): Latitude for mobility.
            longitude (float): Longitude for mobility.
            mobility (bool): Indicator if mobility is used.
            mobility_type (str): Type of mobility.
            radius_federation (float): Radius of federation.
            scheme_mobility (str): Scheme of mobility.
            round_frequency (int): Frequency of rounds.
            mobile_participants_percent (float): Percentage of mobile participants.
            additional_participants (list): List of additional participants.
            schema_additional_participants (str): Schema for additional participants.
        """
        self.scenario_title = scenario_title
        self.scenario_description = scenario_description
        self.deployment = deployment
        self.federation = federation
        self.topology = topology
        self.nodes = nodes
        self.nodes_graph = nodes_graph
        self.n_nodes = n_nodes
        self.matrix = matrix
        self.dataset = dataset
        self.iid = iid
        self.partition_selection = partition_selection
        self.partition_parameter = partition_parameter
        self.model = model
        self.agg_algorithm = agg_algorithm
        self.rounds = rounds
        self.logginglevel = logginglevel
        self.accelerator = accelerator
        self.network_subnet = network_subnet
        self.network_gateway = network_gateway
        self.epochs = epochs
        self.attacks = attacks
        self.poisoned_node_percent = poisoned_node_percent
        self.poisoned_sample_percent = poisoned_sample_percent
        self.poisoned_noise_percent = poisoned_noise_percent
        self.with_reputation = with_reputation
        self.is_dynamic_topology = is_dynamic_topology
        self.is_dynamic_aggregation = is_dynamic_aggregation
        self.target_aggregation = target_aggregation
        self.random_geo = random_geo
        self.latitude = latitude
        self.longitude = longitude
        self.mobility = mobility
        self.mobility_type = mobility_type
        self.radius_federation = radius_federation
        self.scheme_mobility = scheme_mobility
        self.round_frequency = round_frequency
        self.mobile_participants_percent = mobile_participants_percent
        self.additional_participants = additional_participants
        self.schema_additional_participants = schema_additional_participants

    def attack_node_assign(
        self,
        nodes,
        federation,
        attack,
        poisoned_node_percent,
        poisoned_sample_percent,
        poisoned_noise_percent,
    ):
        """Identify which nodes will be attacked"""
        import math
        import random

        nodes_index = []
        # Get the nodes index
        if federation == "DFL":
            nodes_index = list(nodes.keys())
        else:
            for node in nodes:
                if nodes[node]["role"] != "server":
                    nodes_index.append(node)

        mal_nodes_defined = any(nodes[node]["malicious"] for node in nodes)

        attacked_nodes = []

        if not mal_nodes_defined:
            n_nodes = len(nodes_index)
            # Number of attacked nodes, round up
            num_attacked = int(math.ceil(poisoned_node_percent / 100 * n_nodes))
            if num_attacked > n_nodes:
                num_attacked = n_nodes

            # Get the index of attacked nodes
            attacked_nodes = random.sample(nodes_index, num_attacked)

        # Assign the role of each node
        for node in nodes:
            node_att = "No Attack"
            malicious = False
            attack_sample_percent = 0
            poisoned_ratio = 0
            if (str(nodes[node]["id"]) in attacked_nodes) or (nodes[node]["malicious"]):
                malicious = True
                node_att = attack
                attack_sample_percent = poisoned_sample_percent / 100
                poisoned_ratio = poisoned_noise_percent / 100
            nodes[node]["malicious"] = malicious
            nodes[node]["attacks"] = node_att
            nodes[node]["poisoned_sample_percent"] = attack_sample_percent
            nodes[node]["poisoned_ratio"] = poisoned_ratio
        return nodes

    def mobility_assign(self, nodes, mobile_participants_percent):
        """Assign mobility to nodes"""
        import math  # needed for math.floor below
        import random

        # Number of mobile nodes, round down
        num_mobile = math.floor(mobile_participants_percent / 100 * len(nodes))
        if num_mobile > len(nodes):
            num_mobile = len(nodes)

        # Get the index of mobile nodes
        mobile_nodes = random.sample(list(nodes.keys()), num_mobile)

        # Assign the role of each node
        for node in nodes:
            node_mob = False
            if node in mobile_nodes:
                node_mob = True
            nodes[node]["mobility"] = node_mob
        return nodes

    @classmethod
    def from_dict(cls, data):
        return cls(**data)
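Because from_dict simply unpacks the dictionary into the constructor, the keys have to match the constructor's parameter names exactly. The sketch below illustrates that pattern with a deliberately tiny, hypothetical MiniScenario class; it is a stand-in for illustration, not the real Scenario.

```python
from dataclasses import dataclass


@dataclass
class MiniScenario:
    """Hypothetical three-field stand-in for the much larger Scenario class."""

    scenario_title: str
    federation: str
    n_nodes: int

    @classmethod
    def from_dict(cls, data: dict) -> "MiniScenario":
        # Same pattern as Scenario.from_dict: keys must match parameter names.
        return cls(**data)


if __name__ == "__main__":
    ok = MiniScenario.from_dict({"scenario_title": "demo", "federation": "DFL", "n_nodes": 3})
    print(ok)
    try:
        MiniScenario.from_dict({"scenario_title": "demo"})
    except TypeError as exc:
        # Missing (or unexpected) keys surface as a TypeError from the constructor.
        print("rejected:", exc)
```

With this pattern, a configuration dictionary with a missing or misspelled key fails immediately at construction time rather than later during the scenario run.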
__init__(scenario_title, scenario_description, deployment, federation, topology, nodes, nodes_graph, n_nodes, matrix, dataset, iid, partition_selection, partition_parameter, model, agg_algorithm, rounds, logginglevel, accelerator, network_subnet, network_gateway, epochs, attacks, poisoned_node_percent, poisoned_sample_percent, poisoned_noise_percent, with_reputation, is_dynamic_topology, is_dynamic_aggregation, target_aggregation, random_geo, latitude, longitude, mobility, mobility_type, radius_federation, scheme_mobility, round_frequency, mobile_participants_percent, additional_participants, schema_additional_participants)
Initialize the scenario.
Parameters:
scenario_title (str): Title of the scenario.
scenario_description (str): Description of the scenario.
deployment (str): Type of deployment (e.g., 'docker', 'process').
federation (str): Type of federation.
topology (str): Network topology.
nodes (dict): Dictionary of nodes.
nodes_graph (dict): Graph of nodes.
n_nodes (int): Number of nodes.
matrix (list): Matrix of connections.
dataset (str): Dataset used.
iid (bool): Indicator if data is independent and identically distributed.
partition_selection (str): Method of partition selection.
partition_parameter (float): Parameter for partition selection.
model (str): Model used.
agg_algorithm (str): Aggregation algorithm.
rounds (int): Number of rounds.
logginglevel (str): Logging level.
accelerator (str): Accelerator used.
network_subnet (str): Network subnet.
network_gateway (str): Network gateway.
epochs (int): Number of epochs.
attacks (list): List of attacks.
poisoned_node_percent (float): Percentage of poisoned nodes.
poisoned_sample_percent (float): Percentage of poisoned samples.
poisoned_noise_percent (float): Percentage of poisoned noise.
with_reputation (bool): Indicator if reputation is used.
is_dynamic_topology (bool): Indicator if topology is dynamic.
is_dynamic_aggregation (bool): Indicator if aggregation is dynamic.
target_aggregation (str): Target aggregation method.
random_geo (bool): Indicator if random geo is used.
latitude (float): Latitude for mobility.
longitude (float): Longitude for mobility.
mobility (bool): Indicator if mobility is used.
mobility_type (str): Type of mobility.
radius_federation (float): Radius of federation.
scheme_mobility (str): Scheme of mobility.
round_frequency (int): Frequency of rounds.
mobile_participants_percent (float): Percentage of mobile participants.
additional_participants (list): List of additional participants.
schema_additional_participants (str): Schema for additional participants.
def __init__(
    self,
    scenario_title,
    scenario_description,
    deployment,
    federation,
    topology,
    nodes,
    nodes_graph,
    n_nodes,
    matrix,
    dataset,
    iid,
    partition_selection,
    partition_parameter,
    model,
    agg_algorithm,
    rounds,
    logginglevel,
    accelerator,
    network_subnet,
    network_gateway,
    epochs,
    attacks,
    poisoned_node_percent,
    poisoned_sample_percent,
    poisoned_noise_percent,
    with_reputation,
    is_dynamic_topology,
    is_dynamic_aggregation,
    target_aggregation,
    random_geo,
    latitude,
    longitude,
    mobility,
    mobility_type,
    radius_federation,
    scheme_mobility,
    round_frequency,
    mobile_participants_percent,
    additional_participants,
    schema_additional_participants,
):
    """
    Initialize the scenario.

    Args:
        scenario_title (str): Title of the scenario.
        scenario_description (str): Description of the scenario.
        deployment (str): Type of deployment (e.g., 'docker', 'process').
        federation (str): Type of federation.
        topology (str): Network topology.
        nodes (dict): Dictionary of nodes.
        nodes_graph (dict): Graph of nodes.
        n_nodes (int): Number of nodes.
        matrix (list): Matrix of connections.
        dataset (str): Dataset used.
        iid (bool): Indicator if data is independent and identically distributed.
        partition_selection (str): Method of partition selection.
        partition_parameter (float): Parameter for partition selection.
        model (str): Model used.
        agg_algorithm (str): Aggregation algorithm.
        rounds (int): Number of rounds.
        logginglevel (str): Logging level.
        accelerator (str): Accelerator used.
        network_subnet (str): Network subnet.
        network_gateway (str): Network gateway.
        epochs (int): Number of epochs.
        attacks (list): List of attacks.
        poisoned_node_percent (float): Percentage of poisoned nodes.
        poisoned_sample_percent (float): Percentage of poisoned samples.
        poisoned_noise_percent (float): Percentage of poisoned noise.
        with_reputation (bool): Indicator if reputation is used.
        is_dynamic_topology (bool): Indicator if topology is dynamic.
        is_dynamic_aggregation (bool): Indicator if aggregation is dynamic.
        target_aggregation (str): Target aggregation method.
        random_geo (bool): Indicator if random geo is used.
        latitude (float): Latitude for mobility.
        longitude (float): Longitude for mobility.
        mobility (bool): Indicator if mobility is used.
        mobility_type (str): Type of mobility.
        radius_federation (float): Radius of federation.
        scheme_mobility (str): Scheme of mobility.
        round_frequency (int): Frequency of rounds.
        mobile_participants_percent (float): Percentage of mobile participants.
        additional_participants (list): List of additional participants.
        schema_additional_participants (str): Schema for additional participants.
    """
    self.scenario_title = scenario_title
    self.scenario_description = scenario_description
    self.deployment = deployment
    self.federation = federation
    self.topology = topology
    self.nodes = nodes
    self.nodes_graph = nodes_graph
    self.n_nodes = n_nodes
    self.matrix = matrix
    self.dataset = dataset
    self.iid = iid
    self.partition_selection = partition_selection
    self.partition_parameter = partition_parameter
    self.model = model
    self.agg_algorithm = agg_algorithm
    self.rounds = rounds
    self.logginglevel = logginglevel
    self.accelerator = accelerator
    self.network_subnet = network_subnet
    self.network_gateway = network_gateway
    self.epochs = epochs
    self.attacks = attacks
    self.poisoned_node_percent = poisoned_node_percent
    self.poisoned_sample_percent = poisoned_sample_percent
    self.poisoned_noise_percent = poisoned_noise_percent
    self.with_reputation = with_reputation
    self.is_dynamic_topology = is_dynamic_topology
    self.is_dynamic_aggregation = is_dynamic_aggregation
    self.target_aggregation = target_aggregation
    self.random_geo = random_geo
    self.latitude = latitude
    self.longitude = longitude
    self.mobility = mobility
    self.mobility_type = mobility_type
    self.radius_federation = radius_federation
    self.scheme_mobility = scheme_mobility
    self.round_frequency = round_frequency
    self.mobile_participants_percent = mobile_participants_percent
    self.additional_participants = additional_participants
    self.schema_additional_participants = schema_additional_participants
attack_node_assign(nodes, federation, attack, poisoned_node_percent, poisoned_sample_percent, poisoned_noise_percent)
Identify which nodes will be attacked
def attack_node_assign(
    self,
    nodes,
    federation,
    attack,
    poisoned_node_percent,
    poisoned_sample_percent,
    poisoned_noise_percent,
):
    """Identify which nodes will be attacked"""
    import math
    import random

    nodes_index = []
    # Get the nodes index
    if federation == "DFL":
        nodes_index = list(nodes.keys())
    else:
        for node in nodes:
            if nodes[node]["role"] != "server":
                nodes_index.append(node)

    mal_nodes_defined = any(nodes[node]["malicious"] for node in nodes)

    attacked_nodes = []

    if not mal_nodes_defined:
        n_nodes = len(nodes_index)
        # Number of attacked nodes, round up
        num_attacked = int(math.ceil(poisoned_node_percent / 100 * n_nodes))
        if num_attacked > n_nodes:
            num_attacked = n_nodes

        # Get the index of attacked nodes
        attacked_nodes = random.sample(nodes_index, num_attacked)

    # Assign the role of each node
    for node in nodes:
        node_att = "No Attack"
        malicious = False
        attack_sample_percent = 0
        poisoned_ratio = 0
        if (str(nodes[node]["id"]) in attacked_nodes) or (nodes[node]["malicious"]):
            malicious = True
            node_att = attack
            attack_sample_percent = poisoned_sample_percent / 100
            poisoned_ratio = poisoned_noise_percent / 100
        nodes[node]["malicious"] = malicious
        nodes[node]["attacks"] = node_att
        nodes[node]["poisoned_sample_percent"] = attack_sample_percent
        nodes[node]["poisoned_ratio"] = poisoned_ratio
    return nodes
mobility_assign(nodes, mobile_participants_percent)
Assign mobility to nodes
def mobility_assign(self, nodes, mobile_participants_percent):
    """Assign mobility to nodes"""
    import math  # needed for math.floor below
    import random

    # Number of mobile nodes, round down
    num_mobile = math.floor(mobile_participants_percent / 100 * len(nodes))
    if num_mobile > len(nodes):
        num_mobile = len(nodes)

    # Get the index of mobile nodes
    mobile_nodes = random.sample(list(nodes.keys()), num_mobile)

    # Assign the role of each node
    for node in nodes:
        node_mob = False
        if node in mobile_nodes:
            node_mob = True
        nodes[node]["mobility"] = node_mob
    return nodes
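The two assignment methods round in opposite directions: attack_node_assign rounds the node count up, while mobility_assign rounds it down. The sketch below isolates only that arithmetic; the function names count_attacked and count_mobile are hypothetical and do not exist in NEBULA.

```python
import math


def count_attacked(poisoned_node_percent: float, n_nodes: int) -> int:
    """Round up, then cap at the number of eligible nodes (as in attack_node_assign)."""
    return min(math.ceil(poisoned_node_percent / 100 * n_nodes), n_nodes)


def count_mobile(mobile_participants_percent: float, n_nodes: int) -> int:
    """Round down, then cap at the number of nodes (as in mobility_assign)."""
    return min(math.floor(mobile_participants_percent / 100 * n_nodes), n_nodes)


if __name__ == "__main__":
    # 25% of 10 nodes is 2.5, which rounds to 3 attacked nodes but only 2 mobile nodes.
    print(count_attacked(25, 10), count_mobile(25, 10))
```

One consequence is that any non-zero attack percentage selects at least one node, whereas a small mobility percentage on a small federation can select none.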
print_msg_box(msg, indent=1, width=None, title=None, logger_name=None)
Print message-box with optional title.
nebula/addons/functions.py
import logging


def print_msg_box(msg, indent=1, width=None, title=None, logger_name=None):
    """Print message-box with optional title."""
    if logger_name:
        logger = logging.getLogger(logger_name)
    else:
        logger = logging.getLogger()

    if not isinstance(msg, str):
        raise TypeError("msg parameter must be a string")

    lines = msg.split("\n")
    space = " " * indent
    if not width:
        # Auto-size the box to the longest line (or the title, if longer).
        width = max(map(len, lines))
        if title:
            width = max(width, len(title))
    box = f"\n╔{'═' * (width + indent * 2)}╗\n"  # upper_border
    if title:
        if not isinstance(title, str):
            raise TypeError("title parameter must be a string")
        box += f"║{space}{title:<{width}}{space}║\n"  # title
        box += f"║{space}{'-' * len(title):<{width}}{space}║\n"  # underscore
    box += "".join([f"║{space}{line:<{width}}{space}║\n" for line in lines])
    box += f"╚{'═' * (width + indent * 2)}╝"  # lower_border
    logger.info(box)
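To see the box layout without configuring logging, the same construction can return a string instead. This is a simplified sketch: build_msg_box is a hypothetical variant that always auto-sizes the width and omits the type checks and logger handling of print_msg_box.

```python
from typing import Optional


def build_msg_box(msg: str, indent: int = 1, title: Optional[str] = None) -> str:
    """Return the message box as a string (hypothetical variant of print_msg_box)."""
    lines = msg.split("\n")
    space = " " * indent
    # Width of the text area: the longest line, or the title if it is longer.
    width = max(map(len, lines))
    if title:
        width = max(width, len(title))
    border = "═" * (width + indent * 2)
    rows = [f"╔{border}╗"]
    if title:
        rows.append(f"║{space}{title:<{width}}{space}║")
        rows.append(f"║{space}{'-' * len(title):<{width}}{space}║")
    # Left-align every line and pad it to the common width.
    rows += [f"║{space}{line:<{width}}{space}║" for line in lines]
    rows.append(f"╚{border}╝")
    return "\n".join(rows)


if __name__ == "__main__":
    print(build_msg_box("Scenario started\nNodes: 5", title="NEBULA"))
```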
Attack
nebula/addons/attacks/attacks.py
class Attack:\n def __call__(self, *args: Any, **kwds: Any) -> Any:\n return self.attack(*args, **kwds)\n\n def attack(self, received_weights):\n \"\"\"\n Function to perform the attack on the received weights. It should return the\n attacked weights.\n \"\"\"\n raise NotImplementedError\n
attack(received_weights)
Function to perform the attack on the received weights. It should return the attacked weights.
def attack(self, received_weights):\n \"\"\"\n Function to perform the attack on the received weights. It should return the\n attacked weights.\n \"\"\"\n raise NotImplementedError\n
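A new attack is added by subclassing Attack and overriding attack; calling the instance forwards to that method. The sketch below repeats the base class so that it runs on its own and uses plain lists instead of tensors. SignFlipAttack and its scale parameter are hypothetical and are not part of NEBULA.

```python
from typing import Any


class Attack:
    """Copy of the base class shown above, repeated so the sketch is runnable."""

    def __call__(self, *args: Any, **kwds: Any) -> Any:
        return self.attack(*args, **kwds)

    def attack(self, received_weights):
        raise NotImplementedError


class SignFlipAttack(Attack):
    """Hypothetical attack: negate and scale every weight value."""

    def __init__(self, scale: float = 1.0):
        super().__init__()
        self.scale = scale

    def attack(self, received_weights):
        # Build a new mapping so the caller's weights are left untouched.
        return {
            name: [-self.scale * w for w in values]
            for name, values in received_weights.items()
        }


weights = {"fc.weight": [0.5, -1.0], "fc.bias": [0.25]}
poisoned = SignFlipAttack(scale=2.0)(weights)  # __call__ forwards to attack()
print(poisoned)
```

Returning a new mapping is a design choice of this sketch; several of the attacks listed below modify the received weights in place instead.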
DelayerAttack
Bases: Attack
Performs a delayer attack on the received weights: the attack stores a copy of the first weights it receives and returns that stale copy in every subsequent round.
class DelayerAttack(Attack):\n \"\"\"\n Function to perform delayer attack on the received weights. It delays the\n weights for an indefinite number of rounds.\n \"\"\"\n\n def __init__(self):\n super().__init__()\n self.weights = None\n\n def attack(self, received_weights):\n logging.info(\"[DelayerAttack] Performing delayer attack\")\n if self.weights is None:\n self.weights = deepcopy(received_weights)\n return self.weights\n
GLLNeuronInversionAttack
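Performs a neuron inversion attack on the received weights by overwriting the second-to-last entry of the weight dictionary with uniform random values scaled by 10000. The strength and perc constructor arguments are stored but not used by attack.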
class GLLNeuronInversionAttack(Attack):\n \"\"\"\n Function to perform neuron inversion attack on the received weights.\n \"\"\"\n\n def __init__(self, strength=5.0, perc=1.0):\n super().__init__()\n self.strength = strength\n self.perc = perc\n\n def attack(self, received_weights):\n logging.info(\"[GLLNeuronInversionAttack] Performing neuron inversion attack\")\n lkeys = list(received_weights.keys())\n logging.info(f\"Layer inverted: {lkeys[-2]}\")\n received_weights[lkeys[-2]].data = torch.rand(received_weights[lkeys[-2]].shape) * 10000\n return received_weights\n
NoiseInjectionAttack
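Performs a noise injection attack on the received weights by adding Gaussian noise, scaled by strength (default 10000), to every layer. The perc constructor argument is stored but not used by attack.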
class NoiseInjectionAttack(Attack):\n \"\"\"\n Function to perform noise injection attack on the received weights.\n \"\"\"\n\n def __init__(self, strength=10000, perc=1.0):\n super().__init__()\n self.strength = strength\n self.perc = perc\n\n def attack(self, received_weights):\n logging.info(\"[NoiseInjectionAttack] Performing noise injection attack\")\n lkeys = list(received_weights.keys())\n for k in lkeys:\n logging.info(f\"Layer noised: {k}\")\n received_weights[k].data += torch.randn(received_weights[k].shape) * self.strength\n return received_weights\n
SwappingWeightsAttack
Performs a swapping weights attack on the received weights. Note that this attack's performance is not consistent because it is stochastic.
Warning: depending on the layer, the code may not work (because of reshaping between layers), or it may be slow (it scales quadratically with the layer size). Do not apply it to the last layer, as that would make the attack detectable (high loss on the malicious node).
class SwappingWeightsAttack(Attack):\n \"\"\"\n Function to perform swapping weights attack on the received weights. Note that this\n attack performance is not consistent due to its stochasticity.\n\n Warning: depending on the layer the code may not work (due to reshaping in between),\n or it may be slow (scales quadratically with the layer size).\n Do not apply to last layer, as it would make the attack detectable (high loss\n on malicious node).\n \"\"\"\n\n def __init__(self, layer_idx=0):\n super().__init__()\n self.layer_idx = layer_idx\n\n def attack(self, received_weights):\n logging.info(\"[SwappingWeightsAttack] Performing swapping weights attack\")\n lkeys = list(received_weights.keys())\n wm = received_weights[lkeys[self.layer_idx]]\n\n # Compute similarity matrix\n sm = torch.zeros((wm.shape[0], wm.shape[0]))\n for j in range(wm.shape[0]):\n sm[j] = pairwise_cosine_similarity(wm[j].reshape(1, -1), wm.reshape(wm.shape[0], -1))\n\n # Check rows/cols where greedy approach is optimal\n nsort = np.full(sm.shape[0], -1)\n rows = []\n for j in range(sm.shape[0]):\n k = torch.argmin(sm[j])\n if torch.argmin(sm[:, k]) == j:\n nsort[j] = k\n rows.append(j)\n not_rows = np.array([i for i in range(sm.shape[0]) if i not in rows])\n\n # Ensure the rest of the rows are fully permuted (not optimal, but good enough)\n nrs = deepcopy(not_rows)\n nrs = np.random.permutation(nrs)\n while np.any(nrs == not_rows):\n nrs = np.random.permutation(nrs)\n nsort[not_rows] = nrs\n nsort = torch.tensor(nsort)\n\n # Apply permutation to weights\n received_weights[lkeys[self.layer_idx]] = received_weights[lkeys[self.layer_idx]][nsort]\n received_weights[lkeys[self.layer_idx + 1]] = received_weights[lkeys[self.layer_idx + 1]][nsort]\n if self.layer_idx + 2 < len(lkeys):\n received_weights[lkeys[self.layer_idx + 2]] = received_weights[lkeys[self.layer_idx + 2]][:, nsort]\n return received_weights\n
create_attack(attack_name)
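Creates an attack object from its name. Supported names are GLLNeuronInversionAttack, NoiseInjectionAttack, SwappingWeightsAttack, and DelayerAttack; any other name returns None.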
def create_attack(attack_name):\n \"\"\"\n Function to create an attack object from its name.\n \"\"\"\n if attack_name == \"GLLNeuronInversionAttack\":\n return GLLNeuronInversionAttack()\n elif attack_name == \"NoiseInjectionAttack\":\n return NoiseInjectionAttack()\n elif attack_name == \"SwappingWeightsAttack\":\n return SwappingWeightsAttack()\n elif attack_name == \"DelayerAttack\":\n return DelayerAttack()\n else:\n return None\n
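Because create_attack returns None for an unknown name, a misspelled attack name only surfaces later, when the result is called. One possible alternative is a dictionary-based registry that fails immediately. The sketch below is an illustration under assumptions: the two classes are torch-free stand-ins for the real ones, and ATTACK_REGISTRY and create_attack_strict are hypothetical names that do not exist in NEBULA.

```python
class _StubAttack:
    """Stand-in for the real attack classes, which require torch."""

    def attack(self, received_weights):
        return received_weights


class NoiseInjectionAttack(_StubAttack):
    pass


class DelayerAttack(_StubAttack):
    pass


# Hypothetical registry: maps each class name to its constructor.
ATTACK_REGISTRY = {cls.__name__: cls for cls in (NoiseInjectionAttack, DelayerAttack)}


def create_attack_strict(attack_name: str):
    """Create an attack by name, raising ValueError for unknown names."""
    try:
        return ATTACK_REGISTRY[attack_name]()
    except KeyError:
        known = ", ".join(sorted(ATTACK_REGISTRY))
        raise ValueError(
            f"Unknown attack {attack_name!r}; expected one of: {known}"
        ) from None


attack = create_attack_strict("DelayerAttack")
print(type(attack).__name__)
```

With a registry, adding an attack means adding one entry rather than another elif branch.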
add_x_to_image(img)
Add a 10×10-pixel X to the top-left corner of an image
nebula/addons/attacks/poisoning/datapoison.py
def add_x_to_image(img):\n \"\"\"\n Add a 10*10 pixels X at the top-left of an image\n \"\"\"\n for i in range(0, 10):\n for j in range(0, 10):\n # both diagonals of the 10x10 patch form the X\n if i + j == 9 or i == j:\n img[i][j] = 255\n return torch.tensor(img)\n
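An X on a square patch is the union of its two diagonals: the cells where i == j and the cells where i + j == size - 1. The sketch below draws that pattern on a nested list rather than a tensor so it runs without torch; x_mask and stamp_x are hypothetical helper names, not NEBULA functions.

```python
def x_mask(size: int = 10):
    """Boolean size x size grid that is True on both diagonals."""
    return [[i == j or i + j == size - 1 for j in range(size)] for i in range(size)]


def stamp_x(img, size: int = 10, value: int = 255):
    """Return a copy of img with an X of `value` stamped in the top-left corner."""
    out = [row[:] for row in img]  # copy each row; the input is not modified
    for i, row in enumerate(x_mask(size)):
        for j, on in enumerate(row):
            if on:
                out[i][j] = value
    return out


img = [[0] * 28 for _ in range(28)]  # blank 28x28 grayscale image
stamped = stamp_x(img)
for row in stamped[:10]:
    print("".join("X" if v == 255 else "." for v in row[:10]))
```

For an even patch size the two diagonals never share a cell, so a 10×10 patch has exactly 20 marked pixels.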
datapoison(dataset, indices, poisoned_persent, poisoned_ratio, targeted=False, target_label=3, noise_type='salt')
Function to add random noise of various types to the dataset.
def datapoison(\n dataset,\n indices,\n poisoned_persent,\n poisoned_ratio,\n targeted=False,\n target_label=3,\n noise_type=\"salt\",\n):\n \"\"\"\n Function to add random noise of various types to the dataset.\n \"\"\"\n new_dataset = copy.deepcopy(dataset)\n train_data = new_dataset.data\n targets = new_dataset.targets\n num_indices = len(indices)\n if type(noise_type) != str:\n noise_type = noise_type[0]\n\n if targeted == False:\n num_poisoned = int(poisoned_persent * num_indices)\n if num_indices == 0:\n return new_dataset\n if num_poisoned > num_indices:\n return new_dataset\n poisoned_indice = random.sample(indices, num_poisoned)\n\n for i in poisoned_indice:\n t = train_data[i]\n if noise_type == \"salt\":\n # Replaces random pixels with 1.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))\n elif noise_type == \"gaussian\":\n # Gaussian-distributed additive noise.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, mean=0, var=poisoned_ratio, clip=True))\n elif noise_type == \"s&p\":\n # Replaces random pixels with either 1 or low_val, where low_val is 0 for unsigned images or -1 for signed images.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))\n elif noise_type == \"nlp_rawdata\":\n # for NLP data, change the word vector to 0 with p=poisoned_ratio\n poisoned = poison_to_nlp_rawdata(t, poisoned_ratio)\n else:\n print(\"ERROR: poison attack type not supported.\")\n poisoned = t\n train_data[i] = poisoned\n else:\n for i in indices:\n if int(targets[i]) == int(target_label):\n t = train_data[i]\n poisoned = add_x_to_image(t)\n train_data[i] = poisoned\n new_dataset.data = train_data\n return new_dataset\n
poison_to_nlp_rawdata(text_data, poisoned_ratio)
For NLP data, replace word vectors with zero vectors: a fraction poisoned_ratio of the non-zero word vectors is selected at random and zeroed.
def poison_to_nlp_rawdata(text_data, poisoned_ratio):\n \"\"\"\n for NLP data, change the word vector to 0 with p=poisoned_ratio\n \"\"\"\n non_zero_vector_indice = [i for i in range(0, len(text_data)) if text_data[i][0] != 0]\n non_zero_vector_len = len(non_zero_vector_indice)\n\n num_poisoned_token = int(poisoned_ratio * non_zero_vector_len)\n if num_poisoned_token == 0:\n return text_data\n if num_poisoned_token > non_zero_vector_len:\n return text_data\n\n poisoned_token_indice = random.sample(non_zero_vector_indice, num_poisoned_token)\n zero_vector = torch.Tensor(np.zeros(len(text_data[0][0])))\n for i in poisoned_token_indice:\n text_data[i] = zero_vector\n return text_data\n
labelFlipping(dataset, indices, poisoned_persent=0, targeted=False, target_label=4, target_changed_label=7)
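Select a fraction poisoned_persent of the labels and change each of them to a different, randomly chosen label. When targeted is true, every label equal to target_label is changed to target_changed_label instead. Args: dataset: the training dataset, torch.utils.data.Dataset-like. indices: indices of the subset, list-like. poisoned_persent: the fraction of labels to change, float-like.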
nebula/addons/attacks/poisoning/labelflipping.py
def labelFlipping(\n dataset,\n indices,\n poisoned_persent=0,\n targeted=False,\n target_label=4,\n target_changed_label=7,\n):\n \"\"\"\n Select poisoned_persent of labels, and change them to random values.\n Args:\n dataset: the dataset of training data, torch.utils.data.Dataset like.\n indices: Indices of subsets, list like.\n poisoned_persent: The ratio of labels to change, float like.\n \"\"\"\n new_dataset = copy.deepcopy(dataset)\n targets = new_dataset.targets.detach().clone()\n num_indices = len(indices)\n # classes = new_dataset.classes\n # class_to_idx = new_dataset.class_to_idx\n # class_list = [class_to_idx[i] for i in classes]\n # random.sample() rejects sets on Python 3.11+, so keep the classes in a list\n class_list = list(set(targets.tolist()))\n if targeted == False:\n num_flipped = int(poisoned_persent * num_indices)\n if num_indices == 0:\n return new_dataset\n if num_flipped > num_indices:\n return new_dataset\n flipped_indice = random.sample(indices, num_flipped)\n\n for i in flipped_indice:\n t = targets[i]\n flipped = torch.tensor(random.sample(class_list, 1)[0])\n while t == flipped:\n flipped = torch.tensor(random.sample(class_list, 1)[0])\n targets[i] = flipped\n else:\n for i in indices:\n if int(targets[i]) == int(target_label):\n targets[i] = torch.tensor(target_changed_label)\n new_dataset.targets = targets\n return new_dataset\n
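The untargeted branch can be sketched without torch on a plain list of integer labels. This is an illustration under assumptions, not the NEBULA implementation: flip_labels is a hypothetical name, and it differs from the listing by drawing the replacement from the other classes directly instead of resampling until the label changes. That avoids an endless loop when only one class is present.

```python
import random


def flip_labels(targets, indices, fraction, rng=None):
    """Return a copy of targets with `fraction` of the indexed labels changed.

    Hypothetical, torch-free sketch of untargeted label flipping.
    """
    rng = rng or random.Random()
    flipped = list(targets)
    classes = sorted(set(targets))
    indices = list(indices)
    num_flipped = int(fraction * len(indices))
    # Nothing to do with fewer than two classes or an out-of-range fraction.
    if len(classes) < 2 or not 0 < num_flipped <= len(indices):
        return flipped
    for i in rng.sample(indices, num_flipped):
        # Choose among the other classes so the label is guaranteed to change.
        others = [c for c in classes if c != flipped[i]]
        flipped[i] = rng.choice(others)
    return flipped


labels = [0, 1, 2, 0, 1, 2, 0, 1]
poisoned = flip_labels(labels, range(len(labels)), 0.5, rng=random.Random(0))
changed = sum(a != b for a, b in zip(labels, poisoned))
print(changed)
```

Passing a seeded random.Random makes a poisoning run reproducible, which helps when comparing experiments.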
modelpoison(model, poisoned_ratio, noise_type='gaussian')
Function to add random noise of various types to the model parameters.
nebula/addons/attacks/poisoning/modelpoison.py
def modelpoison(model: OrderedDict, poisoned_ratio, noise_type=\"gaussian\"):\n \"\"\"\n Function to add random noise of various types to the model parameter.\n \"\"\"\n poisoned_model = OrderedDict()\n if type(noise_type) != str:\n noise_type = noise_type[0]\n\n for layer in model:\n bt = model[layer]\n t = bt.detach().clone()\n single_point = False\n if len(t.shape) == 0:\n t = t.view(-1)\n single_point = True\n # print(t)\n if noise_type == \"salt\":\n # Replaces random pixels with 1.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))\n elif noise_type == \"gaussian\":\n # Gaussian-distributed additive noise.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, mean=0, var=poisoned_ratio, clip=True))\n elif noise_type == \"s&p\":\n # Replaces random pixels with either 1 or low_val, where low_val is 0 for unsigned images or -1 for signed images.\n poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))\n else:\n print(\"ERROR: poison attack type not supported.\")\n poisoned = t\n if single_point:\n poisoned = poisoned[0]\n poisoned_model[layer] = poisoned\n\n return poisoned_model\n
BlockchainDeployer
Creates files (docker-compose.yaml and genesis.json) for deploying blockchain network
nebula/addons/blockchain/blockchain_deployer.py
class BlockchainDeployer:\n \"\"\"\n Creates files (docker-compose.yaml and genesis.json) for deploying blockchain network\n \"\"\"\n\n def __init__(self, n_validator=3, config_dir=\".\", input_dir=\".\"):\n # root dir of blockchain folder\n self.__input_dir = input_dir\n\n # config folder for storing generated files for deployment\n self.__config_dir = config_dir\n\n # random but static id of boot node to be assigned to all other nodes\n self.__boot_id = None\n\n # ip address of boot node (needs to be static)\n self.__boot_ip = \"172.25.0.101\"\n\n # ip address of non-validator node (needs to be static)\n self.__rpc_ip = \"172.25.0.104\"\n\n # ip address of oracle (needs to be static)\n self.__oracle_ip = \"172.25.0.105\"\n\n # temporary yaml parameter to store config before dump\n self.__yaml = \"\"\n\n # list of reserved addresses which need to be excluded in random address generation\n self.__reserved_addresses = set()\n\n # load original genesis dict\n self.__genesis = self.__load_genesis()\n\n # create blockchain directory in scenario's config directory\n self.__setup_dir()\n\n # add a boot node to the yaml file\n self.__add_boot_node()\n\n # add n validator nodes to the genesis.json and yaml file\n self.__add_validator(n_validator)\n\n # add non-validator node to the yaml file\n self.__add_rpc()\n\n # add oracle node to the genesis.json and yaml file\n self.__add_oracle()\n\n # dump config files into scenario's config directory\n self.__export_config()\n\n def __setup_dir(self) -> None:\n if not os.path.exists(self.__config_dir):\n os.makedirs(self.__config_dir, exist_ok=True)\n\n def __get_unreserved_address(self) -> tuple[int, int]:\n \"\"\"\n Computes a randomized port and last 8 bits of an ip address, where both are not yet used\n Returns: Randomized and unreserved lat 8 bit of ip and port\n\n \"\"\"\n\n # extract reserved ports and ip addresses\n reserved_ips = [address[0] for address in self.__reserved_addresses]\n reserved_ports = [address[1] for 
address in self.__reserved_addresses]\n\n # get randomized ip and port in range still unreserved\n ip = random.choice([number for number in range(10, 254) if number not in reserved_ips])\n port = random.choice([number for number in range(30310, 30360) if number not in reserved_ports])\n\n # add network address to list of reserved addresses\n self.__reserved_addresses.add((ip, port))\n return ip, port\n\n def __copy_dir(self, source_path) -> None:\n \"\"\"\n Copy blockchain folder with current files such as chaincode to config folder\n Args:\n source_path: Path of dir to copy\n\n Returns: None\n\n \"\"\"\n\n curr_path = os.path.dirname(os.path.abspath(__file__))\n\n if not os.path.exists(self.__config_dir):\n os.makedirs(self.__config_dir, exist_ok=True)\n\n target_dir = os.path.join(self.__config_dir, source_path)\n source_dir = os.path.join(curr_path, source_path)\n shutil.copytree(str(source_dir), target_dir, dirs_exist_ok=True)\n\n @staticmethod\n def __load_genesis() -> dict[str, int | str | dict]:\n \"\"\"\n Load original genesis config\n Returns: Genesis json dict\n\n \"\"\"\n return {\n \"config\": {\n \"chainId\": 19265019, # unique id not used by any public Ethereum network\n # block number at which the defined EIP hard fork policies are applied\n \"homesteadBlock\": 0,\n \"eip150Block\": 0,\n \"eip155Block\": 0,\n \"eip158Block\": 0,\n \"byzantiumBlock\": 0,\n \"constantinopleBlock\": 0,\n \"petersburgBlock\": 0,\n \"istanbulBlock\": 0,\n \"muirGlacierBlock\": 0,\n \"berlinBlock\": 0,\n # Proof-of-Authority settings\n \"clique\": {\n \"period\": 1,\n \"epoch\": 10000,\n }, # block time (time in seconds between two blocks) # number of blocks after reset the pending votes\n },\n # unique continuous id of transactions used by PoA\n \"nonce\": \"0x0\",\n # UNIX timestamp of block creation\n \"timestamp\": \"0x5a8efd25\",\n # strictly formated string containing all public wallet addresses of all validators (PoA)\n # will be replaced by public addresses of 
randomly generated validator node\n \"extraData\": \"0x0000000000000000000000000000000000000000000000000000000000000000187c1c14c75bA185A59c621Fbe5dda26D488852DF20C144e8aE3e1aCF7071C4883B759D1B428e7930000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\",\n # maximum gas (computational cost) per transaction\n \"gasLimit\": \"9000000000000\", # \"8000000\" is default for Ethereum but too low for heavy load\n # difficulty for PoW\n \"difficulty\": \"0x1\",\n # root hash of block\n \"mixHash\": \"0x0000000000000000000000000000000000000000000000000000000000000000\",\n # validator of genesis block\n \"coinbase\": \"0x0000000000000000000000000000000000000000\",\n # prefunded public wallet addresses (Oracle)\n \"alloc\": {\n # will be replaced by Oracle's randomized address\n \"0x61DE01FcD560da4D6e05E58bCD34C8Dc92CE36D1\": {\n \"balance\": \"0x200000000000000000000000000000000000000000000000000000000000000\"\n }\n },\n # block number of genesis block\n \"number\": \"0x0\",\n # gas used to validate genesis block\n \"gasUsed\": \"0x0\",\n # hash of parent block (0x0 since first block)\n \"parentHash\": \"0x0000000000000000000000000000000000000000000000000000000000000000\",\n }\n\n def __add_boot_node(self) -> None:\n \"\"\"\n Adds boot node to docker-compose.yaml\n Returns: None\n\n \"\"\"\n\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n # store id of boot node to be inserted into all other nodes\n self.__boot_id = str(keys.PrivateKey(acc.key).public_key)[2:]\n\n # add service to yaml string\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-bootnode:\n hostname: geth-bootnode\n environment:\n - nodekeyhex={w3.to_hex(acc.key)[2:]}\n build:\n dockerfile: {self.__input_dir}/geth/boot.dockerfile\n container_name: boot\n networks:\n chainnet:\n ipv4_address: {self.__boot_ip}\n \"\"\"\n )\n\n def __add_validator(self, cnt) -> None:\n \"\"\"\n Randomly 
generates and adds number(cnt) of validator nodes to yaml and genesis.json\n Args:\n cnt: number of validator nodes to cresate\n\n Returns: None\n\n \"\"\"\n validator_addresses = list()\n\n for id in range(cnt):\n # create random private key and create account from it\n acc = w3.eth.account.create()\n validator_addresses.append(acc.address[2:])\n\n # get random network address\n ip, port = self.__get_unreserved_address()\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-validator-{id}:\n hostname: geth-validator-{id}\n depends_on:\n - geth-bootnode\n environment:\n - address={acc.address}\n - bootnodeId={self.__boot_id}\n - bootnodeIp={self.__boot_ip}\n - port={port}\n build:\n dockerfile: {self.__input_dir}/geth/validator.dockerfile\n args:\n privatekey: {w3.to_hex(acc.key)[2:]}\n password: {w3.to_hex(w3.eth.account.create().key)}\n container_name: validator_{id}\n networks:\n chainnet:\n ipv4_address: 172.25.0.{ip}\n \"\"\"\n )\n\n # create specific Ethereum extra data string for PoA with all public addresses of validators\n extra_data = \"0x\" + \"0\" * 64 + \"\".join([a for a in validator_addresses]) + 65 * \"0\" + 65 * \"0\"\n self.__genesis[\"extraData\"] = extra_data\n\n def __add_oracle(self) -> None:\n \"\"\"\n Adds Oracle node to yaml and genesis.json\n Returns: None\n\n \"\"\"\n\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n # prefund oracle by allocating all funds to its public wallet address\n self.__genesis[\"alloc\"] = {\n acc.address: {\"balance\": \"0x200000000000000000000000000000000000000000000000000000000000000\"}\n }\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n oracle:\n hostname: oracle\n depends_on:\n - geth-rpc\n - geth-bootnode\n environment:\n - PRIVATE_KEY={w3.to_hex(acc.key)[2:]}\n - RPC_IP={self.__rpc_ip}\n build:\n dockerfile: {self.__input_dir}/geth/oracle.dockerfile\n context: {self.__input_dir}\n ports:\n - 8081:8081\n container_name: oracle\n networks:\n chainnet:\n ipv4_address: 
{self.__oracle_ip}\n \"\"\"\n )\n\n def __add_rpc(self):\n \"\"\"\n Add non-validator node to yaml\n Returns: None\n\n \"\"\"\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-rpc:\n hostname: geth-rpc\n depends_on:\n - geth-bootnode\n environment:\n - address={acc.address}\n - bootnodeId={self.__boot_id}\n - bootnodeIp={self.__boot_ip}\n build:\n dockerfile: {self.__input_dir}/geth/rpc.dockerfile\n ports:\n - 8545:8545\n container_name: rpc\n networks:\n chainnet:\n ipv4_address: {self.__rpc_ip}\n \"\"\"\n )\n\n def __add_network(self) -> None:\n \"\"\"\n Adds network config to docker-compose.yaml to create a private network for docker compose\n Returns: None\n\n \"\"\"\n self.__yaml += textwrap.dedent(\n \"\"\"\n networks:\n chainnet:\n name: chainnet\n driver: bridge\n ipam:\n config:\n - subnet: 172.25.0.0/24\n \"\"\"\n )\n\n def __export_config(self) -> None:\n \"\"\"\n Writes configured yaml and genesis files to config folder for deplyoment\n Returns: None\n\n \"\"\"\n\n # format yaml and add docker compose properties\n final_str = textwrap.indent(f\"\"\"{self.__yaml}\"\"\", \" \")\n\n self.__yaml = textwrap.dedent(\n \"\"\"\n version: \"3.8\"\n name: blockchain\n services:\n \"\"\"\n )\n\n self.__yaml += final_str\n\n # add network config last\n self.__add_network()\n\n with open(f\"{self.__config_dir}/blockchain-docker-compose.yml\", \"w+\") as file:\n file.write(self.__yaml)\n\n with open(f\"{self.__input_dir}/geth/genesis.json\", \"w+\") as file:\n json.dump(self.__genesis, file, indent=4)\n\n source = os.path.join(self.__input_dir, \"geth\", \"genesis.json\")\n shutil.copy(source, os.path.join(self.__config_dir, \"genesis.json\"))\n\n source = os.path.join(self.__input_dir, \"chaincode\", \"reputation_system.sol\")\n shutil.copy(source, os.path.join(self.__config_dir, \"reputation_system.sol\"))\n
__add_boot_node()
Adds boot node to docker-compose.yaml Returns: None
def __add_boot_node(self) -> None:\n \"\"\"\n Adds boot node to docker-compose.yaml\n Returns: None\n\n \"\"\"\n\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n # store id of boot node to be inserted into all other nodes\n self.__boot_id = str(keys.PrivateKey(acc.key).public_key)[2:]\n\n # add service to yaml string\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-bootnode:\n hostname: geth-bootnode\n environment:\n - nodekeyhex={w3.to_hex(acc.key)[2:]}\n build:\n dockerfile: {self.__input_dir}/geth/boot.dockerfile\n container_name: boot\n networks:\n chainnet:\n ipv4_address: {self.__boot_ip}\n \"\"\"\n )\n
__add_network()
Adds network config to docker-compose.yaml to create a private network for docker compose Returns: None
def __add_network(self) -> None:\n \"\"\"\n Adds network config to docker-compose.yaml to create a private network for docker compose\n Returns: None\n\n \"\"\"\n self.__yaml += textwrap.dedent(\n \"\"\"\n networks:\n chainnet:\n name: chainnet\n driver: bridge\n ipam:\n config:\n - subnet: 172.25.0.0/24\n \"\"\"\n )\n
__add_oracle()
Adds Oracle node to yaml and genesis.json Returns: None
def __add_oracle(self) -> None:\n \"\"\"\n Adds Oracle node to yaml and genesis.json\n Returns: None\n\n \"\"\"\n\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n # prefund oracle by allocating all funds to its public wallet address\n self.__genesis[\"alloc\"] = {\n acc.address: {\"balance\": \"0x200000000000000000000000000000000000000000000000000000000000000\"}\n }\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n oracle:\n hostname: oracle\n depends_on:\n - geth-rpc\n - geth-bootnode\n environment:\n - PRIVATE_KEY={w3.to_hex(acc.key)[2:]}\n - RPC_IP={self.__rpc_ip}\n build:\n dockerfile: {self.__input_dir}/geth/oracle.dockerfile\n context: {self.__input_dir}\n ports:\n - 8081:8081\n container_name: oracle\n networks:\n chainnet:\n ipv4_address: {self.__oracle_ip}\n \"\"\"\n )\n
__add_rpc()
Add non-validator node to yaml Returns: None
def __add_rpc(self):\n \"\"\"\n Add non-validator node to yaml\n Returns: None\n\n \"\"\"\n # create random private key and create account from it\n acc = w3.eth.account.create()\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-rpc:\n hostname: geth-rpc\n depends_on:\n - geth-bootnode\n environment:\n - address={acc.address}\n - bootnodeId={self.__boot_id}\n - bootnodeIp={self.__boot_ip}\n build:\n dockerfile: {self.__input_dir}/geth/rpc.dockerfile\n ports:\n - 8545:8545\n container_name: rpc\n networks:\n chainnet:\n ipv4_address: {self.__rpc_ip}\n \"\"\"\n )\n
__add_validator(cnt)
Randomly generates cnt validator nodes and adds them to the yaml and genesis.json Args: cnt: number of validator nodes to create
Returns: None
def __add_validator(self, cnt) -> None:\n \"\"\"\n Randomly generates and adds number(cnt) of validator nodes to yaml and genesis.json\n Args:\n cnt: number of validator nodes to create\n\n Returns: None\n\n \"\"\"\n validator_addresses = list()\n\n for id in range(cnt):\n # create random private key and create account from it\n acc = w3.eth.account.create()\n validator_addresses.append(acc.address[2:])\n\n # get random network address\n ip, port = self.__get_unreserved_address()\n\n self.__yaml += textwrap.dedent(\n f\"\"\"\n geth-validator-{id}:\n hostname: geth-validator-{id}\n depends_on:\n - geth-bootnode\n environment:\n - address={acc.address}\n - bootnodeId={self.__boot_id}\n - bootnodeIp={self.__boot_ip}\n - port={port}\n build:\n dockerfile: {self.__input_dir}/geth/validator.dockerfile\n args:\n privatekey: {w3.to_hex(acc.key)[2:]}\n password: {w3.to_hex(w3.eth.account.create().key)}\n container_name: validator_{id}\n networks:\n chainnet:\n ipv4_address: 172.25.0.{ip}\n \"\"\"\n )\n\n # create specific Ethereum extra data string for PoA with all public addresses of validators\n extra_data = \"0x\" + \"0\" * 64 + \"\".join([a for a in validator_addresses]) + 65 * \"0\" + 65 * \"0\"\n self.__genesis[\"extraData\"] = extra_data\n
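The extraData string assembled at the end of __add_validator follows the Clique proof-of-authority layout: 32 bytes of vanity data, then the 20-byte address of each signer, then 65 bytes reserved for the seal. In hexadecimal that is 64 zeros, 40 characters per validator, and 130 zeros, which is what the two 65 * "0" terms add up to. The sketch below rebuilds the string with length checks; build_clique_extra_data is a hypothetical helper name, not a NEBULA function.

```python
import string

VANITY_HEX_CHARS = 64   # 32 bytes of vanity data
ADDRESS_HEX_CHARS = 40  # 20 bytes per signer address
SEAL_HEX_CHARS = 130    # 65 bytes reserved for the signature


def build_clique_extra_data(validator_addresses):
    """Build a Clique genesis extraData string from validator addresses.

    Hypothetical helper; addresses may be given with or without the 0x prefix.
    """
    signers = []
    for address in validator_addresses:
        body = address[2:] if address.lower().startswith("0x") else address
        if len(body) != ADDRESS_HEX_CHARS or any(c not in string.hexdigits for c in body):
            raise ValueError(f"not a 20-byte hex address: {address!r}")
        signers.append(body)
    return "0x" + "0" * VANITY_HEX_CHARS + "".join(signers) + "0" * SEAL_HEX_CHARS


example_address = "0x" + "ab" * 20  # placeholder address for illustration only
extra_data = build_clique_extra_data([example_address])
print(len(extra_data))
```

Validating each address up front is an addition of this sketch: a malformed address would otherwise shift every following field in the string.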
__copy_dir(source_path)
Copy blockchain folder with current files such as chaincode to config folder Args: source_path: Path of dir to copy
def __copy_dir(self, source_path) -> None:\n \"\"\"\n Copy blockchain folder with current files such as chaincode to config folder\n Args:\n source_path: Path of dir to copy\n\n Returns: None\n\n \"\"\"\n\n curr_path = os.path.dirname(os.path.abspath(__file__))\n\n if not os.path.exists(self.__config_dir):\n os.makedirs(self.__config_dir, exist_ok=True)\n\n target_dir = os.path.join(self.__config_dir, source_path)\n source_dir = os.path.join(curr_path, source_path)\n shutil.copytree(str(source_dir), target_dir, dirs_exist_ok=True)\n
__export_config()
Writes configured yaml and genesis files to config folder for deployment Returns: None
def __export_config(self) -> None:\n \"\"\"\n Writes configured yaml and genesis files to config folder for deployment\n Returns: None\n\n \"\"\"\n\n # format yaml and add docker compose properties\n final_str = textwrap.indent(f\"\"\"{self.__yaml}\"\"\", \" \")\n\n self.__yaml = textwrap.dedent(\n \"\"\"\n version: \"3.8\"\n name: blockchain\n services:\n \"\"\"\n )\n\n self.__yaml += final_str\n\n # add network config last\n self.__add_network()\n\n with open(f\"{self.__config_dir}/blockchain-docker-compose.yml\", \"w+\") as file:\n file.write(self.__yaml)\n\n with open(f\"{self.__input_dir}/geth/genesis.json\", \"w+\") as file:\n json.dump(self.__genesis, file, indent=4)\n\n source = os.path.join(self.__input_dir, \"geth\", \"genesis.json\")\n shutil.copy(source, os.path.join(self.__config_dir, \"genesis.json\"))\n\n source = os.path.join(self.__input_dir, \"chaincode\", \"reputation_system.sol\")\n shutil.copy(source, os.path.join(self.__config_dir, \"reputation_system.sol\"))\n
__get_unreserved_address()
Computes a randomized port and the last 8 bits of an IP address, neither of which is in use yet Returns: Randomized and unreserved last 8 bits of the IP address and port
def __get_unreserved_address(self) -> tuple[int, int]:\n \"\"\"\n Computes a randomized port and last 8 bits of an ip address, where both are not yet used\n Returns: Randomized and unreserved last 8 bits of ip and port\n\n \"\"\"\n\n # extract reserved ports and ip addresses\n reserved_ips = [address[0] for address in self.__reserved_addresses]\n reserved_ports = [address[1] for address in self.__reserved_addresses]\n\n # get randomized ip and port in range still unreserved\n ip = random.choice([number for number in range(10, 254) if number not in reserved_ips])\n port = random.choice([number for number in range(30310, 30360) if number not in reserved_ports])\n\n # add network address to list of reserved addresses\n self.__reserved_addresses.add((ip, port))\n return ip, port\n
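Two details of this allocation are worth keeping in mind. In the class listing the reserved set starts empty, so nothing visible there stops a validator from drawing host octet 101, 104, or 105, which are the static addresses of the boot node, RPC node, and oracle. The port range 30310 to 30359 also holds only 50 values, and random.choice raises IndexError on an empty list once they are used up. The sketch below shows one possible way to handle both cases. AddressPool and STATIC_HOSTS are hypothetical names, and pre-reserving the static hosts is an assumption of this sketch rather than documented NEBULA behavior.

```python
import random

# Last octets of the static boot node, RPC node, and oracle addresses.
STATIC_HOSTS = {101, 104, 105}


class AddressPool:
    """Hypothetical allocator for unique (host octet, port) pairs."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._hosts = set(STATIC_HOSTS)  # assumption: static hosts are off limits
        self._ports = set()

    def allocate(self):
        free_hosts = [h for h in range(10, 254) if h not in self._hosts]
        free_ports = [p for p in range(30310, 30360) if p not in self._ports]
        if not free_hosts or not free_ports:
            raise RuntimeError("no unreserved host octet or port left")
        host = self._rng.choice(free_hosts)
        port = self._rng.choice(free_ports)
        self._hosts.add(host)
        self._ports.add(port)
        return host, port


pool = AddressPool(rng=random.Random(42))
allocations = [pool.allocate() for _ in range(3)]
print(allocations)
```

Raising an explicit error when the pool is exhausted makes the 50-validator ceiling visible instead of surfacing as an IndexError from random.choice.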
__load_genesis()
staticmethod
Load original genesis config Returns: Genesis json dict
@staticmethod\ndef __load_genesis() -> dict[str, int | str | dict]:\n \"\"\"\n Load original genesis config\n Returns: Genesis json dict\n\n \"\"\"\n return {\n \"config\": {\n \"chainId\": 19265019, # unique id not used by any public Ethereum network\n # block number at which the defined EIP hard fork policies are applied\n \"homesteadBlock\": 0,\n \"eip150Block\": 0,\n \"eip155Block\": 0,\n \"eip158Block\": 0,\n \"byzantiumBlock\": 0,\n \"constantinopleBlock\": 0,\n \"petersburgBlock\": 0,\n \"istanbulBlock\": 0,\n \"muirGlacierBlock\": 0,\n \"berlinBlock\": 0,\n # Proof-of-Authority settings\n \"clique\": {\n \"period\": 1,\n \"epoch\": 10000,\n }, # block time (time in seconds between two blocks) # number of blocks after reset the pending votes\n },\n # unique continuous id of transactions used by PoA\n \"nonce\": \"0x0\",\n # UNIX timestamp of block creation\n \"timestamp\": \"0x5a8efd25\",\n # strictly formated string containing all public wallet addresses of all validators (PoA)\n # will be replaced by public addresses of randomly generated validator node\n \"extraData\": \"0x0000000000000000000000000000000000000000000000000000000000000000187c1c14c75bA185A59c621Fbe5dda26D488852DF20C144e8aE3e1aCF7071C4883B759D1B428e7930000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\",\n # maximum gas (computational cost) per transaction\n \"gasLimit\": \"9000000000000\", # \"8000000\" is default for Ethereum but too low for heavy load\n # difficulty for PoW\n \"difficulty\": \"0x1\",\n # root hash of block\n \"mixHash\": \"0x0000000000000000000000000000000000000000000000000000000000000000\",\n # validator of genesis block\n \"coinbase\": \"0x0000000000000000000000000000000000000000\",\n # prefunded public wallet addresses (Oracle)\n \"alloc\": {\n # will be replaced by Oracle's randomized address\n \"0x61DE01FcD560da4D6e05E58bCD34C8Dc92CE36D1\": {\n \"balance\": 
\"0x200000000000000000000000000000000000000000000000000000000000000\"\n }\n },\n # block number of genesis block\n \"number\": \"0x0\",\n # gas used to validate genesis block\n \"gasUsed\": \"0x0\",\n # hash of parent block (0x0 since first block)\n \"parentHash\": \"0x0000000000000000000000000000000000000000000000000000000000000000\",\n }\n
Oracle
nebula/addons/blockchain/oracle/app.py
class Oracle:\n def __init__(self):\n # header file, required for interacting with chain code\n self.__contract_abi = dict()\n\n # stores gas expenses for experiments\n self.__gas_store = list()\n\n # stores timing records for experiments\n self.__time_store = list()\n\n # stores reputation records for experiments\n self.__reputation_store = list()\n\n # current (03.2024) average amount of WEI to pay for a unit of gas\n self.__gas_price_per_unit = 27.3\n\n # current (03.2024) average price in USD per WEI\n self.__price_USD_per_WEI = 0.00001971\n\n # static ip address of non-validator node (RPC)\n self.__blockchain_address = \"http://172.25.0.104:8545\"\n\n # executes RPC request to non-validator node until ready\n self.__ready = self.wait_for_blockchain()\n\n # creates an account from the primary key stored in the envs\n self.acc = self.__create_account()\n\n # create Web3 object for making transactions\n self.__web3 = self.__initialize_web3()\n\n # create a Web3 contract object from the compiled chaincode\n self.contract_obj = self.__compile_chaincode()\n\n # deploy the contract to the blockchain network\n self.__contract_address = self.deploy_chaincode()\n\n # update the contract object with the address\n self.contract_obj = self.__web3.eth.contract(\n abi=self.contract_obj.abi,\n bytecode=self.contract_obj.bytecode,\n address=self.contract_address,\n )\n\n @property\n def contract_abi(self):\n return self.__contract_abi\n\n @property\n def contract_address(self):\n return self.__contract_address\n\n @retry((Exception, requests.exceptions.HTTPError), tries=20, delay=10)\n def wait_for_blockchain(self) -> bool:\n \"\"\"\n Executes REST post request for a selected RPC method to check if blockchain\n is up and running\n Returns: None\n\n \"\"\"\n headers = {\"Content-type\": \"application/json\", \"Accept\": \"application/json\"}\n\n data = {\"jsonrpc\": \"2.0\", \"method\": \"eth_accounts\", \"id\": 1, \"params\": []}\n\n request = 
requests.post(url=self.__blockchain_address, json=data, headers=headers)\n\n # raise Exception if status is an error one\n request.raise_for_status()\n\n print(\"ORACLE: RPC node up and running\", flush=True)\n\n return True\n\n def __initialize_web3(self):\n \"\"\"\n Initializes Web3 object and configures it for PoA protocol\n Returns: Web3 object\n\n \"\"\"\n\n # initialize Web3 object with ip of non-validator node\n web3 = Web3(Web3.HTTPProvider(self.__blockchain_address, request_kwargs={\"timeout\": 20})) # 10\n\n # inject Proof-of-Authority settings to object\n web3.middleware_onion.inject(geth_poa_middleware, layer=0)\n\n # automatically sign transactions if available for execution\n web3.middleware_onion.add(construct_sign_and_send_raw_middleware(self.acc))\n\n # inject local account as default\n web3.eth.default_account = self.acc.address\n\n # return initialized object for executing transaction\n print(f\"SUCCESS: Account created at {self.acc.address}\")\n return web3\n\n def __compile_chaincode(self):\n \"\"\"\n Compile raw chaincode and create Web3 contract object with it\n Returns: Web3 contract object\n\n \"\"\"\n\n # open raw solidity file\n with open(\"reputation_system.sol\") as file:\n simple_storage_file = file.read()\n\n # set compiler version\n install_solc(\"0.8.22\")\n\n # compile solidity code\n compiled_sol = compile_standard(\n {\n \"language\": \"Solidity\",\n \"sources\": {\"reputation_system.sol\": {\"content\": simple_storage_file}},\n \"settings\": {\n \"evmVersion\": \"paris\",\n \"outputSelection\": {\"*\": {\"*\": [\"abi\", \"metadata\", \"evm.bytecode\", \"evm.sourceMap\"]}},\n \"optimizer\": {\"enabled\": True, \"runs\": 1000},\n },\n },\n solc_version=\"0.8.22\",\n )\n\n # store compiled code as json\n with open(\"compiled_code.json\", \"w\") as file:\n json.dump(compiled_sol, file)\n\n # retrieve bytecode from the compiled contract\n contract_bytecode = 
compiled_sol[\"contracts\"][\"reputation_system.sol\"][\"ReputationSystem\"][\"evm\"][\"bytecode\"][\n \"object\"\n ]\n\n # retrieve ABI from compiled contract\n self.__contract_abi = json.loads(\n compiled_sol[\"contracts\"][\"reputation_system.sol\"][\"ReputationSystem\"][\"metadata\"]\n )[\"output\"][\"abi\"]\n\n print(\"Oracle: Solidity files compiled and bytecode ready\", flush=True)\n\n # return draft Web3 contract object\n return self.__web3.eth.contract(abi=self.__contract_abi, bytecode=contract_bytecode)\n\n @staticmethod\n def __create_account():\n \"\"\"\n Retrieves the private key from the envs, set during docker build\n Returns: Web3 account object\n\n \"\"\"\n\n # retrieve private key, set during ducker build\n private_key = os.environ.get(\"PRIVATE_KEY\")\n\n # return Web3 account object\n return Account.from_key(\"0x\" + private_key)\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def transfer_funds(self, address):\n \"\"\"\n Creates transaction to blockchain network for assigning funds to Cores\n Args:\n address: public wallet address of Core to assign funds to\n\n Returns: Transaction receipt\n\n \"\"\"\n\n # create raw transaction with all required parameters to change state of ledger\n raw_transaction = {\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.acc.address,\n \"value\": self.__web3.to_wei(\"500\", \"ether\"),\n \"to\": self.__web3.to_checksum_address(address),\n \"nonce\": self.__web3.eth.get_transaction_count(self.acc.address, \"pending\"),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price_per_unit, \"gwei\"),\n \"gas\": self.__web3.to_wei(\"22000\", \"wei\"),\n }\n\n # sign transaction with private key and execute it\n tx_receipt = self.__sign_and_deploy(raw_transaction)\n\n # return transaction receipt\n return f\"SUCESS: {tx_receipt}\"\n\n def __sign_and_deploy(self, trx_hash):\n \"\"\"\n Signs a function call to the chain code with the primary key and awaits the receipt\n Args:\n trx_hash: 
Transformed dictionary of all properties relevant for call to chain code\n\n Returns: transaction receipt confirming the successful write to the ledger\n\n \"\"\"\n\n # transaction is signed with private key\n signed_transaction = self.__web3.eth.account.sign_transaction(trx_hash, private_key=self.acc.key)\n\n # confirmation that transaction was passed from non-validator node to validator nodes\n executed_transaction = self.__web3.eth.send_raw_transaction(signed_transaction.rawTransaction)\n\n # non-validator node awaited the successful validation by validation nodes and returns receipt\n transaction_receipt = self.__web3.eth.wait_for_transaction_receipt(executed_transaction, timeout=20) # 5\n\n # report used gas for experiment\n self.report_gas(transaction_receipt.gasUsed, 0)\n\n return transaction_receipt\n\n @retry(Exception, tries=20, delay=5)\n def deploy_chaincode(self):\n \"\"\"\n Creates transaction to deploy chain code on the blockchain network by\n sending transaction to non-validator node\n Returns: address of chain code on the network\n\n \"\"\"\n\n # create raw transaction with all properties to deploy contract\n raw_transaction = self.contract_obj.constructor().build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.acc.address,\n \"value\": self.__web3.to_wei(\"3\", \"ether\"),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price_per_unit, \"gwei\"),\n \"nonce\": self.__web3.eth.get_transaction_count(self.acc.address, \"pending\"),\n })\n\n # sign transaction with private key and executes it\n tx_receipt = self.__sign_and_deploy(raw_transaction)\n\n # store the address received from the non-validator node\n contract_address = tx_receipt[\"contractAddress\"]\n\n # returns contract address to provide to the cores later\n return contract_address\n\n def get_balance(self, addr):\n \"\"\"\n Creates transaction to blockchain network to request balance for parameter address\n Args:\n addr: public wallet address of account\n\n Returns: 
current balance in ether (ETH)\n\n \"\"\"\n\n # converts address type required for making a transaction\n cAddr = self.__web3.to_checksum_address(addr)\n\n # executes the transaction directly, no signing required\n balance = self.__web3.eth.get_balance(cAddr, \"pending\")\n\n # returns JSON response with ether balance to requesting core\n return {\"address\": cAddr, \"balance_eth\": self.__web3.from_wei(balance, \"ether\")}\n\n def report_gas(self, amount: int, aggregation_round: int) -> None:\n \"\"\"\n Experiment method for collecting and reporting gas usage statistics\n Args:\n aggregation_round: Aggregation round of sender\n amount: Amount of gas spent in WEI\n\n Returns: None\n\n \"\"\"\n\n # store the recorded gas for experiment\n self.__gas_store.append((amount, aggregation_round))\n\n def get_gas_report(self) -> Mapping[str, str]:\n \"\"\"\n Experiment method for requesting the summed up records of reported gas usage\n Returns: JSON with name:value (WEI/USD) for every reported node\n\n \"\"\"\n # sum up all reported costs\n total_wei = sum(record[0] for record in self.__gas_store)\n\n # convert sum in WEI to USD by computing with gas price USD per WEI\n total_usd = round(total_wei * self.__price_USD_per_WEI)\n\n return {\"Sum (WEI)\": total_wei, \"Sum (USD)\": f\"{total_usd:,}\"}\n\n @property\n def gas_store(self):\n \"\"\"\n Experiment method for requesting the detailed records of the gas reports\n Returns: list of records of type: list[(node, timestamp, gas)]\n\n \"\"\"\n return self.__gas_store\n\n def report_time(self, time_s: float, aggregation_round: int) -> None:\n \"\"\"\n Experiment method for collecting and reporting time statistics\n Args:\n aggregation_round: Aggregation round of node\n method: Name of node which reports time\n time_s: Amount of time spend on method\n\n Returns: None\n\n \"\"\"\n\n # store the recorded time for experiment\n self.__time_store.append((time_s, aggregation_round))\n\n def report_reputation(self, records: list, 
aggregation_round: int, sender: str) -> None:\n \"\"\"\n Experiment method for collecting and reporting reputations statistics\n Args:\n aggregation_round: Current aggregation round of sender\n records: list of (name:reputation) records\n sender: node reporting its local view\n\n Returns: None\n\n \"\"\"\n\n # store the recorded reputation for experiment\n self.__reputation_store.extend([(record[0], record[1], aggregation_round, sender) for record in records])\n\n @property\n def time_store(self) -> list:\n \"\"\"\n Experiment method for requesting all records of nodes which reported timings\n Returns: JSON with method:(sum_time, n_calls) for every reported node\n\n \"\"\"\n return self.__time_store\n\n @property\n def reputation_store(self) -> list:\n \"\"\"\n Experiment method for requesting all records of reputations\n Returns: list with (name, reputation, timestamp)\n\n \"\"\"\n return self.__reputation_store\n\n @property\n def ready(self) -> bool:\n \"\"\"\n Returns true if the Oracle is ready itself and the chain code was deployed successfully\n Returns: True if ready False otherwise\n\n \"\"\"\n return self.__ready\n
gas_store
property
Experiment method for requesting the detailed records of the gas reports Returns: list of (amount, aggregation_round) tuples, as stored by report_gas
ready: bool
Indicates whether the Oracle itself is ready and the chain code was deployed successfully Returns: True if ready, False otherwise
reputation_store: list
Experiment method for requesting all records of reputations Returns: list of (name, reputation, aggregation_round, sender) tuples
time_store: list
Experiment method for requesting all recorded timings Returns: list of (time_s, aggregation_round) tuples
__compile_chaincode()
Compile raw chaincode and create Web3 contract object with it Returns: Web3 contract object
def __compile_chaincode(self):\n \"\"\"\n Compile raw chaincode and create Web3 contract object with it\n Returns: Web3 contract object\n\n \"\"\"\n\n # open raw solidity file\n with open(\"reputation_system.sol\") as file:\n simple_storage_file = file.read()\n\n # set compiler version\n install_solc(\"0.8.22\")\n\n # compile solidity code\n compiled_sol = compile_standard(\n {\n \"language\": \"Solidity\",\n \"sources\": {\"reputation_system.sol\": {\"content\": simple_storage_file}},\n \"settings\": {\n \"evmVersion\": \"paris\",\n \"outputSelection\": {\"*\": {\"*\": [\"abi\", \"metadata\", \"evm.bytecode\", \"evm.sourceMap\"]}},\n \"optimizer\": {\"enabled\": True, \"runs\": 1000},\n },\n },\n solc_version=\"0.8.22\",\n )\n\n # store compiled code as json\n with open(\"compiled_code.json\", \"w\") as file:\n json.dump(compiled_sol, file)\n\n # retrieve bytecode from the compiled contract\n contract_bytecode = compiled_sol[\"contracts\"][\"reputation_system.sol\"][\"ReputationSystem\"][\"evm\"][\"bytecode\"][\n \"object\"\n ]\n\n # retrieve ABI from compiled contract\n self.__contract_abi = json.loads(\n compiled_sol[\"contracts\"][\"reputation_system.sol\"][\"ReputationSystem\"][\"metadata\"]\n )[\"output\"][\"abi\"]\n\n print(\"Oracle: Solidity files compiled and bytecode ready\", flush=True)\n\n # return draft Web3 contract object\n return self.__web3.eth.contract(abi=self.__contract_abi, bytecode=contract_bytecode)\n
__create_account()
Retrieves the private key from the envs, set during docker build Returns: Web3 account object
@staticmethod\ndef __create_account():\n \"\"\"\n Retrieves the private key from the envs, set during docker build\n Returns: Web3 account object\n\n \"\"\"\n\n # retrieve private key, set during ducker build\n private_key = os.environ.get(\"PRIVATE_KEY\")\n\n # return Web3 account object\n return Account.from_key(\"0x\" + private_key)\n
__initialize_web3()
Initializes Web3 object and configures it for PoA protocol Returns: Web3 object
def __initialize_web3(self):\n \"\"\"\n Initializes Web3 object and configures it for PoA protocol\n Returns: Web3 object\n\n \"\"\"\n\n # initialize Web3 object with ip of non-validator node\n web3 = Web3(Web3.HTTPProvider(self.__blockchain_address, request_kwargs={\"timeout\": 20})) # 10\n\n # inject Proof-of-Authority settings to object\n web3.middleware_onion.inject(geth_poa_middleware, layer=0)\n\n # automatically sign transactions if available for execution\n web3.middleware_onion.add(construct_sign_and_send_raw_middleware(self.acc))\n\n # inject local account as default\n web3.eth.default_account = self.acc.address\n\n # return initialized object for executing transaction\n print(f\"SUCCESS: Account created at {self.acc.address}\")\n return web3\n
__sign_and_deploy(trx_hash)
Signs a function call to the chain code with the private key and awaits the receipt Args: trx_hash: Transformed dictionary of all properties relevant for call to chain code
Returns: transaction receipt confirming the successful write to the ledger
def __sign_and_deploy(self, trx_hash):\n \"\"\"\n Signs a function call to the chain code with the primary key and awaits the receipt\n Args:\n trx_hash: Transformed dictionary of all properties relevant for call to chain code\n\n Returns: transaction receipt confirming the successful write to the ledger\n\n \"\"\"\n\n # transaction is signed with private key\n signed_transaction = self.__web3.eth.account.sign_transaction(trx_hash, private_key=self.acc.key)\n\n # confirmation that transaction was passed from non-validator node to validator nodes\n executed_transaction = self.__web3.eth.send_raw_transaction(signed_transaction.rawTransaction)\n\n # non-validator node awaited the successful validation by validation nodes and returns receipt\n transaction_receipt = self.__web3.eth.wait_for_transaction_receipt(executed_transaction, timeout=20) # 5\n\n # report used gas for experiment\n self.report_gas(transaction_receipt.gasUsed, 0)\n\n return transaction_receipt\n
deploy_chaincode()
Creates transaction to deploy chain code on the blockchain network by sending transaction to non-validator node Returns: address of chain code on the network
@retry(Exception, tries=20, delay=5)\ndef deploy_chaincode(self):\n \"\"\"\n Creates transaction to deploy chain code on the blockchain network by\n sending transaction to non-validator node\n Returns: address of chain code on the network\n\n \"\"\"\n\n # create raw transaction with all properties to deploy contract\n raw_transaction = self.contract_obj.constructor().build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.acc.address,\n \"value\": self.__web3.to_wei(\"3\", \"ether\"),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price_per_unit, \"gwei\"),\n \"nonce\": self.__web3.eth.get_transaction_count(self.acc.address, \"pending\"),\n })\n\n # sign transaction with private key and executes it\n tx_receipt = self.__sign_and_deploy(raw_transaction)\n\n # store the address received from the non-validator node\n contract_address = tx_receipt[\"contractAddress\"]\n\n # returns contract address to provide to the cores later\n return contract_address\n
get_balance(addr)
Queries the blockchain network for the balance of the given address; no signed transaction is required Args: addr: public wallet address of account
Returns: dict with the checksummed address and its current balance in ether (ETH)
def get_balance(self, addr):\n \"\"\"\n Creates transaction to blockchain network to request balance for parameter address\n Args:\n addr: public wallet address of account\n\n Returns: current balance in ether (ETH)\n\n \"\"\"\n\n # converts address type required for making a transaction\n cAddr = self.__web3.to_checksum_address(addr)\n\n # executes the transaction directly, no signing required\n balance = self.__web3.eth.get_balance(cAddr, \"pending\")\n\n # returns JSON response with ether balance to requesting core\n return {\"address\": cAddr, \"balance_eth\": self.__web3.from_wei(balance, \"ether\")}\n
get_gas_report()
Experiment method for requesting the summed-up records of reported gas usage Returns: mapping with the total of all reported amounts (WEI) and its equivalent in USD
def get_gas_report(self) -> Mapping[str, str]:\n \"\"\"\n Experiment method for requesting the summed up records of reported gas usage\n Returns: JSON with name:value (WEI/USD) for every reported node\n\n \"\"\"\n # sum up all reported costs\n total_wei = sum(record[0] for record in self.__gas_store)\n\n # convert sum in WEI to USD by computing with gas price USD per WEI\n total_usd = round(total_wei * self.__price_USD_per_WEI)\n\n return {\"Sum (WEI)\": total_wei, \"Sum (USD)\": f\"{total_usd:,}\"}\n
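The aggregation performed by get_gas_report can be sketched outside the class as follows. This is a minimal illustration, not the Oracle's own code: the function name `summarize_gas` and the sample records are hypothetical, while the conversion factor is the value stored in `__price_USD_per_WEI` above.

```python
def summarize_gas(records, usd_per_unit=0.00001971):
    """Sum (amount, aggregation_round) records and convert the total to USD.

    `summarize_gas` is a hypothetical stand-in for Oracle.get_gas_report;
    `usd_per_unit` mirrors the constant used in the source.
    """
    # only the first tuple element (the reported amount) is summed
    total = sum(amount for amount, _round in records)

    # the USD figure is rounded to a whole number before formatting
    total_usd = round(total * usd_per_unit)

    # the USD value is rendered as a string with thousands separators
    return {"Sum (WEI)": total, "Sum (USD)": f"{total_usd:,}"}


# two made-up records: (amount, aggregation_round)
report = summarize_gas([(21000, 0), (150000, 1)])
```

Note that the USD entry is a string while the WEI entry stays an integer, so the `Mapping[str, str]` annotation in the source does not describe both values.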
report_gas(amount, aggregation_round)
Experiment method for collecting and reporting gas usage statistics Args: aggregation_round: Aggregation round of sender amount: Amount of gas spent in WEI
def report_gas(self, amount: int, aggregation_round: int) -> None:\n \"\"\"\n Experiment method for collecting and reporting gas usage statistics\n Args:\n aggregation_round: Aggregation round of sender\n amount: Amount of gas spent in WEI\n\n Returns: None\n\n \"\"\"\n\n # store the recorded gas for experiment\n self.__gas_store.append((amount, aggregation_round))\n
report_reputation(records, aggregation_round, sender)
Experiment method for collecting and reporting reputation statistics Args: aggregation_round: Current aggregation round of sender records: list of (name:reputation) records sender: node reporting its local view
def report_reputation(self, records: list, aggregation_round: int, sender: str) -> None:\n \"\"\"\n Experiment method for collecting and reporting reputations statistics\n Args:\n aggregation_round: Current aggregation round of sender\n records: list of (name:reputation) records\n sender: node reporting its local view\n\n Returns: None\n\n \"\"\"\n\n # store the recorded reputation for experiment\n self.__reputation_store.extend([(record[0], record[1], aggregation_round, sender) for record in records])\n
report_time(time_s, aggregation_round)
Experiment method for collecting and reporting time statistics Args: aggregation_round: Aggregation round of node time_s: Amount of time spent
def report_time(self, time_s: float, aggregation_round: int) -> None:\n \"\"\"\n Experiment method for collecting and reporting time statistics\n Args:\n aggregation_round: Aggregation round of node\n time_s: Amount of time spent\n\n Returns: None\n\n \"\"\"\n\n # store the recorded time for experiment\n self.__time_store.append((time_s, aggregation_round))\n
transfer_funds(address)
Creates transaction to blockchain network for assigning funds to Cores Args: address: public wallet address of Core to assign funds to
Returns: status string containing the transaction receipt
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef transfer_funds(self, address):\n \"\"\"\n Creates transaction to blockchain network for assigning funds to Cores\n Args:\n address: public wallet address of Core to assign funds to\n\n Returns: Transaction receipt\n\n \"\"\"\n\n # create raw transaction with all required parameters to change state of ledger\n raw_transaction = {\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.acc.address,\n \"value\": self.__web3.to_wei(\"500\", \"ether\"),\n \"to\": self.__web3.to_checksum_address(address),\n \"nonce\": self.__web3.eth.get_transaction_count(self.acc.address, \"pending\"),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price_per_unit, \"gwei\"),\n \"gas\": self.__web3.to_wei(\"22000\", \"wei\"),\n }\n\n # sign transaction with private key and execute it\n tx_receipt = self.__sign_and_deploy(raw_transaction)\n\n # return transaction receipt\n return f\"SUCESS: {tx_receipt}\"\n
wait_for_blockchain()
Executes REST post request for a selected RPC method to check if blockchain is up and running Returns: True once the RPC node answers without an HTTP error
@retry((Exception, requests.exceptions.HTTPError), tries=20, delay=10)\ndef wait_for_blockchain(self) -> bool:\n \"\"\"\n Executes REST post request for a selected RPC method to check if blockchain\n is up and running\n Returns: True once the RPC node answers without an HTTP error\n\n \"\"\"\n headers = {\"Content-type\": \"application/json\", \"Accept\": \"application/json\"}\n\n data = {\"jsonrpc\": \"2.0\", \"method\": \"eth_accounts\", \"id\": 1, \"params\": []}\n\n request = requests.post(url=self.__blockchain_address, json=data, headers=headers)\n\n # raise Exception if status is an error one\n request.raise_for_status()\n\n print(\"ORACLE: RPC node up and running\", flush=True)\n\n return True\n
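The polling behaviour that the `@retry` decorator gives wait_for_blockchain can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in: `retry_sketch` and `probe_node` are hypothetical names, and the code only imitates the "call again after a delay until it succeeds or the tries run out" idea rather than reproducing the third-party decorator used in the source.

```python
import functools
import time


def retry_sketch(exceptions, tries=3, delay=0.0):
    """Hypothetical retry decorator: re-run the function on the given exceptions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # re-raise once the last allowed attempt has failed
                    if attempt == tries:
                        raise
                    time.sleep(delay)

        return wrapper

    return decorator


attempts = []


@retry_sketch(ConnectionError, tries=5, delay=0.0)
def probe_node():
    """Simulated readiness probe that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("RPC node not reachable yet")
    return True


node_ready = probe_node()
```

With the values used in the source (20 tries, 10 seconds apart), a probe of this kind would give the node a little over three minutes to come up before the error propagates.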
error_handler(func)
Adds default status and header to all REST responses used for Oracle
def error_handler(func):\n \"\"\"Adds default status and header to all REST responses used for Oracle\"\"\"\n\n @wraps(func)\n def wrapper(*args, **kwargs):\n try:\n return func(*args, **kwargs), 200, {\"Content-Type\": \"application/json\"}\n except Exception as e:\n return jsonify({\"error\": str(e)}), 500, {\"Content-Type\": \"application/json\"}\n\n return wrapper\n
check_properties(*args)
Computes the fraction of the arguments that have values (neither None nor an empty string).
Args:
    args (list): All the arguments.
Returns:
    float: The fraction of arguments that have values.
nebula/addons/trustworthiness/calculation.py
def check_properties(*args):\n \"\"\"\n Check if all the arguments have values.\n\n Args:\n args (list): All the arguments.\n\n Returns:\n float: The mean of arguments that have values.\n \"\"\"\n\n result = map(lambda x: x is not None and x != \"\", args)\n return np.mean(list(result))\n
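Because the mean of a list of booleans is the share of True entries, check_properties effectively reports how many of its arguments are filled in. A dependency-free sketch of the same idea follows; `fraction_with_values` is a hypothetical name, and the sketch assumes at least one argument is passed.

```python
def fraction_with_values(*args):
    """Return the share of arguments that are neither None nor an empty string."""
    # True for every argument that carries a value
    flags = [x is not None and x != "" for x in args]

    # True counts as 1 and False as 0, so this is the mean of the flags
    return sum(flags) / len(flags)


# two of the four arguments carry a value
share = fraction_with_values("mnist", None, "", 0.1)
```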
get_avg_loss_accuracy(loss_files, accuracy_files)
Calculates the mean loss and accuracy of the nodes' models.
Args:
    loss_files (list): Files that contain the loss of the models of the nodes.
    accuracy_files (list): Files that contain the accuracies of the models of the nodes.
Returns:
    tuple: The mean loss of the models, the mean accuracy of the models, and the standard deviation of the accuracies of the models.
def get_avg_loss_accuracy(loss_files, accuracy_files):\n \"\"\"\n Calculates the mean loss and accuracy of the nodes' models.\n\n Args:\n loss_files (list): Files that contain the loss of the models of the nodes.\n accuracy_files (list): Files that contain the accuracies of the models of the nodes.\n\n Returns:\n tuple: The mean loss of the models, the mean accuracy of the models, and the standard deviation of the accuracies of the models.\n \"\"\"\n total_accuracy = 0\n total_loss = 0\n number_files = len(loss_files)\n accuracies = []\n\n for file_loss, file_accuracy in zip(loss_files, accuracy_files, strict=False):\n with open(file_loss) as f:\n loss = f.read()\n\n with open(file_accuracy) as f:\n accuracy = f.read()\n\n total_loss += float(loss)\n total_accuracy += float(accuracy)\n accuracies.append(float(accuracy))\n\n avg_loss = total_loss / number_files\n avg_accuracy = total_accuracy / number_files\n\n std_accuracy = statistics.stdev(accuracies)\n\n return avg_loss, avg_accuracy, std_accuracy\n
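One practical caveat: `statistics.stdev` needs at least two data points, so a scenario with a single node would raise `StatisticsError` in the function above. The following sketch shows one possible guard. The helper name `summarize_accuracies` and the choice to report 0.0 for a single value are assumptions, not behaviour of the source.

```python
import statistics


def summarize_accuracies(accuracies):
    """Return (mean, sample standard deviation) of a list of accuracies."""
    mean_acc = statistics.fmean(accuracies)

    # stdev is undefined for fewer than two values; 0.0 is an assumed fallback
    std_acc = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0

    return mean_acc, std_acc


single = summarize_accuracies([0.9])
pair = summarize_accuracies([0.8, 0.9])
```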
get_bytes_models(models_files)
Calculates the mean bytes of the final models of the nodes.
Args:
    models_files (list): List of final models.
Returns:
    float: The mean bytes of the models.
def get_bytes_models(models_files):\n \"\"\"\n Calculates the mean bytes of the final models of the nodes.\n\n Args:\n models_files (list): List of final models.\n\n Returns:\n float: The mean bytes of the models.\n \"\"\"\n\n total_models_size = 0\n number_models = len(models_files)\n\n for file in models_files:\n model_size = os.path.getsize(file)\n total_models_size += model_size\n\n avg_model_size = total_models_size / number_models\n\n return avg_model_size\n
get_bytes_sent_recv(bytes_sent_files, bytes_recv_files)
Calculates the total and mean bytes sent and received by the nodes.
Args:
    bytes_sent_files (list): Files that contain the bytes sent of the nodes.
    bytes_recv_files (list): Files that contain the bytes received of the nodes.
Returns:
    tuple: The total bytes sent, the total bytes received, the mean bytes sent and the mean bytes received of the nodes.
def get_bytes_sent_recv(bytes_sent_files, bytes_recv_files):\n \"\"\"\n Calculates the total and mean bytes sent and received by the nodes.\n\n Args:\n bytes_sent_files (list): Files that contain the bytes sent of the nodes.\n bytes_recv_files (list): Files that contain the bytes received of the nodes.\n\n Returns:\n tuple: The total bytes sent, the total bytes received, the mean bytes sent and the mean bytes received of the nodes.\n \"\"\"\n total_upload_bytes = 0\n total_download_bytes = 0\n number_files = len(bytes_sent_files)\n\n for file_bytes_sent, file_bytes_recv in zip(bytes_sent_files, bytes_recv_files, strict=False):\n with open(file_bytes_sent) as f:\n bytes_sent = f.read()\n\n with open(file_bytes_recv) as f:\n bytes_recv = f.read()\n\n total_upload_bytes += int(bytes_sent)\n total_download_bytes += int(bytes_recv)\n\n avg_upload_bytes = total_upload_bytes / number_files\n avg_download_bytes = total_download_bytes / number_files\n return (\n total_upload_bytes,\n total_download_bytes,\n avg_upload_bytes,\n avg_download_bytes,\n )\n
get_clever_score(model, test_sample, nb_classes, learning_rate)
Calculates the CLEVER score.
Args:
    model (object): The model.
    test_sample (object): One test sample to calculate the CLEVER score.
    nb_classes (int): The number of classes of the model.
    learning_rate (float): The learning rate of the model.
Returns:
    float: The CLEVER score.
def get_clever_score(model, test_sample, nb_classes, learning_rate):\n \"\"\"\n Calculates the CLEVER score.\n\n Args:\n model (object): The model.\n test_sample (object): One test sample to calculate the CLEVER score.\n nb_classes (int): The nb_classes of the model.\n learning_rate (float): The learning rate of the model.\n\n Returns:\n float: The CLEVER score.\n \"\"\"\n\n images, _ = test_sample\n background = images[-1]\n\n criterion = nn.CrossEntropyLoss()\n optimizer = optim.Adam(model.parameters(), learning_rate)\n\n # Create the ART classifier\n classifier = PyTorchClassifier(\n model=model,\n loss=criterion,\n optimizer=optimizer,\n input_shape=(1, 28, 28),\n nb_classes=nb_classes,\n )\n\n score_untargeted = clever_u(\n classifier,\n background.numpy(),\n 10,\n 5,\n R_L2,\n norm=2,\n pool_factor=3,\n verbose=False,\n )\n return score_untargeted\n
get_cv(list=None, std=None, mean=None)
Get the coefficient of variation.
Args:
    list (list): List in which the coefficient of variation will be calculated.
    std (float): Standard deviation of a list.
    mean (float): Mean of a list.
Returns:
    float: The coefficient of variation calculated.
def get_cv(list=None, std=None, mean=None):\n \"\"\"\n Get the coefficient of variation.\n\n Args:\n list (list): List in which the coefficient of variation will be calculated.\n std (float): Standard deviation of a list.\n mean (float): Mean of a list.\n\n Returns:\n float: The coefficient of variation calculated.\n \"\"\"\n if std is not None and mean is not None:\n return std / mean\n\n if list is not None:\n return np.std(list) / np.mean(list)\n\n return 0\n
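The coefficient of variation is the standard deviation divided by the mean. Since `np.std` defaults to the population standard deviation, the list branch of get_cv can be mirrored with `statistics.pstdev`. The sketch below is only an illustration with a hypothetical helper name and made-up data.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


# mean is 5 and population standard deviation is 2 for this sample
cv = coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

A value close to 0 indicates that the entries are tightly clustered around the mean. The ratio is undefined when the mean is 0, a case that neither this sketch nor the function above handles.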
get_elapsed_time(scenario)
Calculates the elapsed time during the execution of the scenario.
Args:
    scenario (object): Scenario required.
Returns:
    float: The elapsed time in minutes.
def get_elapsed_time(scenario):\n \"\"\"\n Calculates the elapsed time during the execution of the scenario.\n\n Args:\n scenario (object): Scenario required.\n\n Returns:\n float: The elapsed time.\n \"\"\"\n start_time = scenario[1]\n end_time = scenario[2]\n\n start_date = datetime.strptime(start_time, \"%d/%m/%Y %H:%M:%S\")\n end_date = datetime.strptime(end_time, \"%d/%m/%Y %H:%M:%S\")\n\n elapsed_time = (end_date - start_date).total_seconds() / 60\n\n return elapsed_time\n
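The timestamps are parsed with the day-first format `%d/%m/%Y %H:%M:%S`, and the difference is converted from seconds to minutes. A small worked example, using a hypothetical helper and made-up timestamps:

```python
from datetime import datetime

# same format string as in get_elapsed_time
TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def elapsed_minutes(start_time, end_time):
    """Return the minutes between two timestamps in TIME_FORMAT."""
    start_date = datetime.strptime(start_time, TIME_FORMAT)
    end_date = datetime.strptime(end_time, TIME_FORMAT)
    return (end_date - start_date).total_seconds() / 60


# 1 hour, 30 minutes and 30 seconds apart
minutes = elapsed_minutes("01/03/2024 10:00:00", "01/03/2024 11:30:30")
```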
get_feature_importance_cv(model, test_sample)
Calculates the coefficient of variation of the feature importance.
Args:
    model (object): The model.
    test_sample (object): One test sample to calculate the feature importance.
Returns:
    float: The coefficient of variation of the feature importance.
def get_feature_importance_cv(model, test_sample):\n \"\"\"\n Calculates the coefficient of variation of the feature importance.\n\n Args:\n model (object): The model.\n test_sample (object): One test sample to calculate the feature importance.\n\n Returns:\n float: The coefficient of variation of the feature importance.\n \"\"\"\n\n try:\n cv = 0\n batch_size = 10\n device = \"cpu\"\n\n if isinstance(model, torch.nn.Module):\n batched_data, _ = test_sample\n\n n = batch_size\n m = math.floor(0.8 * n)\n\n background = batched_data[:m].to(device)\n test_data = batched_data[m:n].to(device)\n\n e = shap.DeepExplainer(model, background)\n shap_values = e.shap_values(test_data)\n if shap_values is not None and len(shap_values) > 0:\n sums = np.array([shap_values[i].sum() for i in range(len(shap_values))])\n abs_sums = np.absolute(sums)\n cv = variation(abs_sums)\n except Exception as e:\n logger.warning(\"Could not compute feature importance CV with shap\")\n cv = 1\n if math.isnan(cv):\n cv = 1\n return cv\n
get_global_privacy_risk(dp, epsilon, n)
Calculates the global privacy risk by epsilon and the number of clients.
Args:
    dp (bool): Indicates if differential privacy is used or not.
    epsilon (int): The epsilon value.
    n (int): The number of clients in the scenario.
Returns:
    float: The global privacy risk.
def get_global_privacy_risk(dp, epsilon, n):\n \"\"\"\n Calculates the global privacy risk by epsilon and the number of clients.\n\n Args:\n dp (bool): Indicates if differential privacy is used or not.\n epsilon (int): The epsilon value.\n n (int): The number of clients in the scenario.\n\n Returns:\n float: The global privacy risk.\n \"\"\"\n\n if dp is True and isinstance(epsilon, numbers.Number):\n return 1 / (1 + (n - 1) * math.pow(e, -epsilon))\n else:\n return 1\n
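With differential privacy enabled, the function evaluates 1 / (1 + (n - 1) * e^(-epsilon)). The sketch below reproduces only that branch to show how the value behaves; the helper name is hypothetical, and `math.exp` is used in place of `math.pow(e, ...)`.

```python
import math


def global_privacy_risk(epsilon, n):
    """Risk formula used when differential privacy is active."""
    return 1 / (1 + (n - 1) * math.exp(-epsilon))


# epsilon = 0 gives 1/n: with 10 clients the risk is 0.1
risk_strong_privacy = global_privacy_risk(0, 10)

# a large epsilon (weak privacy) pushes the risk towards 1
risk_weak_privacy = global_privacy_risk(20, 10)

# with a single client the risk is 1 regardless of epsilon
risk_single_client = global_privacy_risk(5, 1)
```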
get_map_value_score(score_key, score_map)
Finds the score by the score_key in the score_map and returns the value.
Args:
    score_key (string): The key to look up in the score_map.
    score_map (dict): The score map defined in the eval_metrics.json file.
Returns:
    float: The score obtained in the score_map.
def get_map_value_score(score_key, score_map):\n \"\"\"\n Finds the score by the score_key in the score_map and returns the value.\n\n Args:\n score_key (string): The key to look up in the score_map.\n score_map (dict): The score map defined in the eval_metrics.json file.\n\n Returns:\n float: The score obtained in the score_map.\n \"\"\"\n score = 0\n if score_map is None:\n logger.warning(\"Score map is missing\")\n else:\n score = score_map[score_key]\n return score\n
get_mapped_score(score_key, score_map)
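Finds the score by the score_key in the score_map and normalizes it against the other scores in the map.
Args:
    score_key (string): The key to look up in the score_map.
    score_map (dict): The score map defined in the eval_metrics.json file.
Returns:
    float: The normalized score in [0, 1].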
def get_mapped_score(score_key, score_map):\n \"\"\"\n Finds the score by the score_key in the score_map.\n\n Args:\n score_key (string): The key to look up in the score_map.\n score_map (dict): The score map defined in the eval_metrics.json file.\n\n Returns:\n float: The normalized score of [0, 1].\n \"\"\"\n score = 0\n if score_map is None:\n logger.warning(\"Score map is missing\")\n else:\n keys = [key for key, value in score_map.items()]\n scores = [value for key, value in score_map.items()]\n normalized_scores = get_normalized_scores(scores)\n normalized_score_map = dict(zip(keys, normalized_scores, strict=False))\n score = normalized_score_map.get(score_key, np.nan)\n\n return score\n
get_normalized_scores(scores)
Calculates the normalized scores of a list.
Args:
    scores (list): The values that will be normalized.
Returns:
    list: The normalized list.
def get_normalized_scores(scores):\n \"\"\"\n Calculates the normalized scores of a list.\n\n Args:\n scores (list): The values that will be normalized.\n\n Returns:\n list: The normalized list.\n \"\"\"\n normalized = [(x - np.min(scores)) / (np.max(scores) - np.min(scores)) for x in scores]\n return normalized\n
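The normalization is a min-max rescaling, (x - min) / (max - min), so the expression above divides by zero when all scores are equal. A dependency-free sketch with one possible guard follows. Both the helper name and the decision to return zeros in the degenerate case are assumptions.

```python
def min_max_normalize(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(scores), max(scores)

    # assumed fallback: identical scores carry no ranking information
    if highest == lowest:
        return [0.0 for _ in scores]

    return [(x - lowest) / (highest - lowest) for x in scores]


normalized = min_max_normalize([2, 4, 6])
degenerate = min_max_normalize([3, 3])
```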
get_range_score(value, ranges, direction='asc')
Maps the value to a range and gets the score by the range and direction.
Args:
    value (int): The input score.
    ranges (list): The ranges defined.
    direction (string): Asc means the higher the range the higher the score, desc means otherwise.
Returns:
    float: The normalized score in [0, 1].
def get_range_score(value, ranges, direction=\"asc\"):\n \"\"\"\n Maps the value to a range and gets the score by the range and direction.\n\n Args:\n value (int): The input score.\n ranges (list): The ranges defined.\n direction (string): Asc means the higher the range the higher the score, desc means otherwise.\n\n Returns:\n float: The normalized score of [0, 1].\n \"\"\"\n\n if not (type(value) == int or type(value) == float):\n logger.warning(\"Input value is not a number\")\n logger.warning(f\"{value}\")\n return 0\n else:\n score = 0\n if ranges is None:\n logger.warning(\"Score ranges are missing\")\n else:\n total_bins = len(ranges) + 1\n bin = np.digitize(value, ranges, right=True)\n score = 1 - (bin / total_bins) if direction == \"desc\" else bin / total_bins\n return score\n
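For increasing bin edges, `np.digitize(value, ranges, right=True)` returns the index i with ranges[i-1] < value <= ranges[i], which is what `bisect.bisect_left` computes. The sketch below uses that equivalence to show the binning without NumPy. The helper name and the example edges are made up, and the type checks and warnings of the original are omitted.

```python
from bisect import bisect_left


def range_score(value, ranges, direction="asc"):
    """Score a value by the bin it falls into, given sorted bin edges."""
    total_bins = len(ranges) + 1

    # index of the bin: edges strictly below the value are counted
    bin_index = bisect_left(ranges, value)

    score = bin_index / total_bins
    return 1 - score if direction == "desc" else score


edges = [10, 20, 30]
mid_score = range_score(25, edges)           # third of four bins
edge_score = range_score(10, edges)          # a value on an edge stays in the lower bin
top_score = range_score(35, edges)           # above the last edge
top_score_desc = range_score(35, edges, "desc")
```

Because the bin index is divided by the number of bins rather than by the highest index, the ascending score tops out at len(ranges) / (len(ranges) + 1), here 0.75, and never reaches 1.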
get_scaled_score(value, scale, direction)
Maps a score of a specific scale into the scale between zero and one.
value
The raw value of the metric.
scale
List containing the minimum and maximum value the value can fall in between.
def get_scaled_score(value, scale: list, direction: str):\n \"\"\"\n Maps a score of a specific scale into the scale between zero and one.\n\n Args:\n value (int or float): The raw value of the metric.\n scale (list): List containing the minimum and maximum value the value can fall in between.\n\n Returns:\n float: The normalized score of [0, 1].\n \"\"\"\n\n score = 0\n try:\n value_min, value_max = scale[0], scale[1]\n except Exception:\n logger.warning(\"Score minimum or score maximum is missing. The minimum has been set to 0 and the maximum to 1\")\n value_min, value_max = 0, 1\n if not value:\n logger.warning(\"Score value is missing. Set value to zero\")\n else:\n low, high = 0, 1\n if value >= value_max:\n score = 1\n elif value <= value_min:\n score = 0\n else:\n diff = value_max - value_min\n diffScale = high - low\n score = (float(value) - value_min) * (float(diffScale) / diff) + low\n if direction == \"desc\":\n score = high - score\n\n return score\n
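The listing above treats any falsy `value` as missing, so a legitimate raw value of `0` returns a score of 0 before `direction` is applied, even when `direction` is `"desc"`. The sketch below shows one possible variant of the linear rescaling that treats only `None` as missing. The name `scaled_score`, the default scale of `(0, 1)`, and the error raised for an empty scale are assumptions, not NEBULA behaviour.

```python
def scaled_score(value, scale=(0, 1), direction="asc"):
    """Linearly map value from [scale[0], scale[1]] into [0, 1].

    Hypothetical variant: only None counts as a missing value, and
    the direction flip is applied after clamping.
    """
    if value is None:
        return 0.0
    value_min, value_max = scale[0], scale[1]
    if value_max == value_min:
        raise ValueError("scale must span a non-zero interval")
    fraction = (float(value) - value_min) / (value_max - value_min)
    fraction = min(1.0, max(0.0, fraction))  # clamp out-of-range inputs
    return 1.0 - fraction if direction == "desc" else fraction


print(scaled_score(25, [0, 100]))          # 0.25
print(scaled_score(25, [0, 100], "desc"))  # 0.75
print(scaled_score(0, [0, 100], "desc"))   # 1.0
```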
get_true_score(value, direction)
Returns 1 - value if direction is 'desc', otherwise returns value. Boolean inputs are mapped to 1 (True) or 0 (False), and non-numeric inputs yield 0.
The score obtained.
def get_true_score(value, direction):\n \"\"\"\n Returns 1 - value if direction is 'desc', otherwise returns value.\n\n Args:\n value (int): The input score.\n direction (string): Asc means the higher the range the higher the score, desc means otherwise.\n\n Returns:\n float: The score obtained.\n \"\"\"\n\n if value is True:\n return 1\n elif value is False:\n return 0\n else:\n if not (type(value) == int or type(value) == float):\n logger.warning(\"Input value is not a number\")\n logger.warning(f\"{value}.\")\n return 0\n else:\n if direction == \"desc\":\n return 1 - value\n else:\n return value\n
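The order of the checks in this function matters: in Python, `bool` is a subclass of `int`, so booleans have to be handled before any numeric test. A compact sketch of the same decision order using `isinstance` follows; the name `true_score` is hypothetical, and unlike the exact `type(...) ==` comparison in the listing, `isinstance` also accepts subclasses of `int` and `float`.

```python
def true_score(value, direction="asc"):
    """Return 1/0 for booleans, 0 for non-numbers, else value or 1 - value."""
    if isinstance(value, bool):
        # Booleans are mapped directly and are not flipped by direction,
        # matching the behaviour of the listing above.
        return 1 if value else 0
    if not isinstance(value, (int, float)):
        return 0
    return 1 - value if direction == "desc" else value


print(true_score(True))          # 1
print(true_score(0.25, "desc"))  # 0.75
print(true_score("high"))        # 0
```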
get_value(value)
Get the value of a metric.
The value of the metric.
def get_value(value):\n \"\"\"\n Get the value of a metric.\n\n Args:\n value (float): The value of the metric.\n\n Returns:\n float: The value of the metric.\n \"\"\"\n\n return value\n
stop_emissions_tracking_and_save(tracker, outdir, emissions_file, role, workload, sample_size=0)
Stops the CodeCarbon emissions tracker and appends the relevant information to the emissions CSV file.
tracker
The emissions tracker object holding information.
outdir
The path of the output directory of the experiment.
emissions_file
The path to the emissions file.
role
Either client or server depending on the role.
workload
Either aggregation or training depending on the workload.
sample_size
The number of samples used for training; 0 for aggregation.
0
def stop_emissions_tracking_and_save(\n tracker: EmissionsTracker,\n outdir: str,\n emissions_file: str,\n role: str,\n workload: str,\n sample_size: int = 0,\n):\n \"\"\"\n Stops emissions tracking object from CodeCarbon and saves relevant information to emissions.csv file.\n\n Args:\n tracker (object): The emissions tracker object holding information.\n outdir (str): The path of the output directory of the experiment.\n emissions_file (str): The path to the emissions file.\n role (str): Either client or server depending on the role.\n workload (str): Either aggregation or training depending on the workload.\n sample_size (int): The number of samples used for training, if aggregation 0.\n \"\"\"\n\n tracker.stop()\n\n emissions_file = os.path.join(outdir, emissions_file)\n\n if exists(emissions_file):\n df = pd.read_csv(emissions_file)\n else:\n df = pd.DataFrame(\n columns=[\n \"role\",\n \"energy_grid\",\n \"emissions\",\n \"workload\",\n \"CPU_model\",\n \"GPU_model\",\n ]\n )\n try:\n energy_grid = (tracker.final_emissions_data.emissions / tracker.final_emissions_data.energy_consumed) * 1000\n df = pd.concat(\n [\n df,\n pd.DataFrame({\n \"role\": role,\n \"energy_grid\": [energy_grid],\n \"emissions\": [tracker.final_emissions_data.emissions],\n \"workload\": workload,\n \"CPU_model\": tracker.final_emissions_data.cpu_model\n if tracker.final_emissions_data.cpu_model\n else \"None\",\n \"GPU_model\": tracker.final_emissions_data.gpu_model\n if tracker.final_emissions_data.gpu_model\n else \"None\",\n \"CPU_used\": True if tracker.final_emissions_data.cpu_energy else False,\n \"GPU_used\": True if tracker.final_emissions_data.gpu_energy else False,\n \"energy_consumed\": tracker.final_emissions_data.energy_consumed,\n \"sample_size\": sample_size,\n }),\n ],\n ignore_index=True,\n )\n df.to_csv(emissions_file, encoding=\"utf-8\", index=False)\n except Exception as e:\n logger.warning(e)\n
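The core of this function is the derived `energy_grid` column: emissions divided by energy consumed, multiplied by 1000. The sketch below reproduces that calculation and the append-to-CSV step with the standard `csv` module instead of pandas and without a real tracker. The function name, the reduced column set, and the assumption that emissions are given in kg and energy in kWh (so that the result reads as grams per kWh) are illustrative only. Note that the listing above relies on its `try`/`except` to catch a zero `energy_consumed`, whereas the sketch guards the division explicitly.

```python
import csv
import os
import tempfile

# Reduced, hypothetical column set for illustration.
FIELDS = ["role", "workload", "emissions", "energy_consumed", "energy_grid", "sample_size"]


def append_emissions_row(path, role, workload, emissions, energy_consumed, sample_size=0):
    """Append one record to the CSV at `path`, creating it with a header if needed."""
    # Carbon intensity: emissions per unit of energy, scaled by 1000.
    energy_grid = emissions / energy_consumed * 1000 if energy_consumed else 0.0
    is_new = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "role": role,
            "workload": workload,
            "emissions": emissions,
            "energy_consumed": energy_consumed,
            "energy_grid": energy_grid,
            "sample_size": sample_size,
        })
    return energy_grid


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emissions.csv")
    append_emissions_row(path, "client", "training", 0.002, 0.01, sample_size=600)
    append_emissions_row(path, "server", "aggregation", 0.001, 0.004)
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

print(len(rows), rows[0]["role"], rows[1]["workload"])
```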
Factsheet
nebula/addons/trustworthiness/factsheet.py
class Factsheet:\n def __init__(self):\n \"\"\"\n Manager class to populate the FactSheet\n \"\"\"\n self.factsheet_file_nm = \"factsheet.json\"\n self.factsheet_template_file_nm = \"factsheet_template.json\"\n\n def populate_factsheet_pre_train(self, data, scenario_name):\n \"\"\"\n Populates the factsheet with values before the training.\n\n Args:\n data (dict): Contains the data from the scenario.\n scenario_name (string): The name of the scenario.\n \"\"\"\n\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n\n factsheet_template = os.path.join(dirname, f\"configs/{self.factsheet_template_file_nm}\")\n\n if not os.path.exists(factsheet_file):\n shutil.copyfile(factsheet_template, factsheet_file)\n\n with open(factsheet_file, \"r+\") as f:\n factsheet = {}\n\n try:\n factsheet = json.load(f)\n\n if data is not None:\n logger.info(\"FactSheet: Populating factsheet with pre training metrics\")\n\n federation = data[\"federation\"]\n n_nodes = int(data[\"n_nodes\"])\n dataset = data[\"dataset\"]\n algorithm = data[\"model\"]\n aggregation_algorithm = data[\"agg_algorithm\"]\n n_rounds = int(data[\"rounds\"])\n attack = data[\"attacks\"]\n poisoned_node_percent = int(data[\"poisoned_node_percent\"])\n poisoned_sample_percent = int(data[\"poisoned_sample_percent\"])\n poisoned_noise_percent = int(data[\"poisoned_noise_percent\"])\n with_reputation = data[\"with_reputation\"]\n is_dynamic_topology = data[\"is_dynamic_topology\"]\n is_dynamic_aggregation = data[\"is_dynamic_aggregation\"]\n target_aggregation = data[\"target_aggregation\"]\n\n if attack != \"No Attack\" and with_reputation == True and is_dynamic_aggregation == True:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number 
of rounds is {n_rounds}. In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. A reputation-based defence with a dynamic aggregation based on the aggregation algorithm {target_aggregation} is used, and the trustworthiness of the project is desired.\"\n\n elif attack != \"No Attack\" and with_reputation == True and is_dynamic_topology == True:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. A reputation-based defence with a dynamic topology is used, and the trustworthiness of the project is desired.\"\n\n elif attack != \"No Attack\" and with_reputation == False:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. 
No defence mechanism is used, and the trustworthiness of the project is desired.\"\n\n elif attack == \"No Attack\":\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. No attacks against clients are used, and the trustworthiness of the project is desired.\"\n\n # Set project specifications\n factsheet[\"project\"][\"overview\"] = data[\"scenario_title\"]\n factsheet[\"project\"][\"purpose\"] = data[\"scenario_description\"]\n factsheet[\"project\"][\"background\"] = background\n\n # Set data specifications\n factsheet[\"data\"][\"provenance\"] = data[\"dataset\"]\n factsheet[\"data\"][\"preprocessing\"] = data[\"topology\"]\n\n # Set participants\n factsheet[\"participants\"][\"client_num\"] = data[\"n_nodes\"] or \"\"\n factsheet[\"participants\"][\"sample_client_rate\"] = 1\n factsheet[\"participants\"][\"client_selector\"] = \"\"\n\n # Set configuration\n factsheet[\"configuration\"][\"aggregation_algorithm\"] = data[\"agg_algorithm\"] or \"\"\n factsheet[\"configuration\"][\"training_model\"] = data[\"model\"] or \"\"\n factsheet[\"configuration\"][\"personalization\"] = False\n factsheet[\"configuration\"][\"visualization\"] = True\n factsheet[\"configuration\"][\"total_round_num\"] = n_rounds\n\n if poisoned_noise_percent != 0:\n factsheet[\"configuration\"][\"differential_privacy\"] = True\n factsheet[\"configuration\"][\"dp_epsilon\"] = poisoned_noise_percent\n else:\n factsheet[\"configuration\"][\"differential_privacy\"] = False\n factsheet[\"configuration\"][\"dp_epsilon\"] = \"\"\n\n if dataset == \"MNIST\" and algorithm == \"MLP\":\n model = MNISTModelMLP()\n elif dataset == \"MNIST\" and algorithm == \"CNN\":\n model = MNISTModelCNN()\n elif dataset == \"Syscall\" and algorithm == 
\"MLP\":\n model = SyscallModelMLP()\n else:\n model = CIFAR10ModelCNN()\n\n factsheet[\"configuration\"][\"learning_rate\"] = model.get_learning_rate()\n factsheet[\"configuration\"][\"trainable_param_num\"] = model.count_parameters()\n factsheet[\"configuration\"][\"local_update_steps\"] = 1\n\n except JSONDecodeError as e:\n logger.warning(f\"{factsheet_file} is invalid\")\n logger.error(e)\n\n f.seek(0)\n f.truncate()\n json.dump(factsheet, f, indent=4)\n f.close()\n\n def populate_factsheet_post_train(self, scenario):\n \"\"\"\n Populates the factsheet with values after the training.\n\n Args:\n scenario (object): The scenario object.\n \"\"\"\n scenario_name = scenario[0]\n\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n\n logger.info(\"FactSheet: Populating factsheet with post training metrics\")\n\n with open(factsheet_file, \"r+\") as f:\n factsheet = {}\n try:\n factsheet = json.load(f)\n\n dataset = factsheet[\"data\"][\"provenance\"]\n model = factsheet[\"configuration\"][\"training_model\"]\n\n actual_dir = os.getcwd()\n files_dir = f\"{actual_dir}/trustworthiness/files/{scenario_name}\"\n data_dir = f\"{actual_dir}/trustworthiness/data/\"\n\n models_files = glob.glob(os.path.join(files_dir, \"*final_model*\"))\n bytes_sent_files = glob.glob(os.path.join(files_dir, \"*bytes_sent*\"))\n bytes_recv_files = glob.glob(os.path.join(files_dir, \"*bytes_recv*\"))\n loss_files = glob.glob(os.path.join(files_dir, \"*loss*\"))\n accuracy_files = glob.glob(os.path.join(files_dir, \"*accuracy*\"))\n dataloaders_files = glob.glob(os.path.join(files_dir, \"*train_loader*\"))\n test_dataloader_file = f\"{files_dir}/participant_1_test_loader.pk\"\n train_model_file = f\"{files_dir}/participant_1_train_model.pk\"\n emissions_file = os.path.join(files_dir, \"emissions.csv\")\n\n # Entropy\n i = 0\n for file in dataloaders_files:\n with open(file, \"rb\") as file:\n dataloader = pickle.load(file)\n get_entropy(i, 
scenario_name, dataloader)\n i += 1\n\n with open(f\"{files_dir}/entropy.json\") as file:\n entropy_distribution = json.load(file)\n\n values = np.array(list(entropy_distribution.values()))\n\n normalized_values = (values - np.min(values)) / (np.max(values) - np.min(values))\n\n avg_entropy = np.mean(normalized_values)\n\n factsheet[\"data\"][\"avg_entropy\"] = avg_entropy\n\n # Set performance data\n result_avg_loss_accuracy = get_avg_loss_accuracy(loss_files, accuracy_files)\n factsheet[\"performance\"][\"test_loss_avg\"] = result_avg_loss_accuracy[0]\n factsheet[\"performance\"][\"test_acc_avg\"] = result_avg_loss_accuracy[1]\n test_acc_cv = get_cv(std=result_avg_loss_accuracy[2], mean=result_avg_loss_accuracy[1])\n factsheet[\"fairness\"][\"test_acc_cv\"] = 1 if test_acc_cv > 1 else test_acc_cv\n\n factsheet[\"system\"][\"avg_time_minutes\"] = get_elapsed_time(scenario)\n factsheet[\"system\"][\"avg_model_size\"] = get_bytes_models(models_files)\n\n result_bytes_sent_recv = get_bytes_sent_recv(bytes_sent_files, bytes_recv_files)\n factsheet[\"system\"][\"total_upload_bytes\"] = result_bytes_sent_recv[0]\n factsheet[\"system\"][\"total_download_bytes\"] = result_bytes_sent_recv[1]\n factsheet[\"system\"][\"avg_upload_bytes\"] = result_bytes_sent_recv[2]\n factsheet[\"system\"][\"avg_download_bytes\"] = result_bytes_sent_recv[3]\n\n factsheet[\"fairness\"][\"selection_cv\"] = 1\n\n count_class_samples(scenario_name, dataloaders_files)\n\n with open(f\"{files_dir}/count_class.json\") as file:\n class_distribution = json.load(file)\n\n class_samples_sizes = [x for x in class_distribution.values()]\n class_imbalance = get_cv(list=class_samples_sizes)\n factsheet[\"fairness\"][\"class_imbalance\"] = 1 if class_imbalance > 1 else class_imbalance\n\n with open(train_model_file, \"rb\") as file:\n lightning_model = pickle.load(file)\n\n if dataset == \"MNIST\" and model == \"MLP\":\n pytorch_model = MNISTTorchModelMLP()\n elif dataset == \"MNIST\" and model == 
\"CNN\":\n pytorch_model = MNISTTorchModelCNN()\n elif dataset == \"Syscall\" and model == \"MLP\":\n pytorch_model = SyscallTorchModelMLP()\n else:\n pytorch_model = CIFAR10TorchModelCNN()\n\n pytorch_model.load_state_dict(lightning_model.state_dict())\n\n with open(test_dataloader_file, \"rb\") as file:\n test_dataloader = pickle.load(file)\n\n test_sample = next(iter(test_dataloader))\n\n lr = factsheet[\"configuration\"][\"learning_rate\"]\n value_clever = get_clever_score(pytorch_model, test_sample, 10, lr)\n\n factsheet[\"performance\"][\"test_clever\"] = 1 if value_clever > 1 else value_clever\n\n feature_importance = get_feature_importance_cv(pytorch_model, test_sample)\n\n factsheet[\"performance\"][\"test_feature_importance_cv\"] = (\n 1 if feature_importance > 1 else feature_importance\n )\n\n # Set emissions metrics\n emissions = None if emissions_file is None else read_csv(emissions_file)\n if emissions is not None:\n logger.info(\"FactSheet: Populating emissions\")\n cpu_spez_df = pd.read_csv(os.path.join(data_dir, \"CPU_benchmarks_v4.csv\"), header=0)\n emissions[\"CPU_model\"] = (\n emissions[\"CPU_model\"].astype(str).str.replace(r\"\\([^)]*\\)\", \"\", regex=True)\n )\n emissions[\"CPU_model\"] = emissions[\"CPU_model\"].astype(str).str.replace(r\" CPU\", \"\", regex=True)\n emissions[\"GPU_model\"] = emissions[\"GPU_model\"].astype(str).str.replace(r\"[0-9] x \", \"\", regex=True)\n emissions = pd.merge(\n emissions,\n cpu_spez_df[[\"cpuName\", \"powerPerf\"]],\n left_on=\"CPU_model\",\n right_on=\"cpuName\",\n how=\"left\",\n )\n gpu_spez_df = pd.read_csv(os.path.join(data_dir, \"GPU_benchmarks_v7.csv\"), header=0)\n emissions = pd.merge(\n emissions,\n gpu_spez_df[[\"gpuName\", \"powerPerformance\"]],\n left_on=\"GPU_model\",\n right_on=\"gpuName\",\n how=\"left\",\n )\n\n emissions.drop(\"cpuName\", axis=1, inplace=True)\n emissions.drop(\"gpuName\", axis=1, inplace=True)\n emissions[\"powerPerf\"] = emissions[\"powerPerf\"].astype(float)\n 
emissions[\"powerPerformance\"] = emissions[\"powerPerformance\"].astype(float)\n client_emissions = emissions.loc[emissions[\"role\"] == \"client\"]\n client_avg_carbon_intensity = round(client_emissions[\"energy_grid\"].mean(), 2)\n factsheet[\"sustainability\"][\"avg_carbon_intensity_clients\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_carbon_intensity_clients\"],\n client_avg_carbon_intensity,\n \"\",\n )\n factsheet[\"sustainability\"][\"emissions_training\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_training\"],\n client_emissions[\"emissions\"].sum(),\n \"\",\n )\n factsheet[\"participants\"][\"avg_dataset_size\"] = check_field_filled(\n factsheet,\n [\"participants\", \"avg_dataset_size\"],\n client_emissions[\"sample_size\"].mean(),\n \"\",\n )\n\n server_emissions = emissions.loc[emissions[\"role\"] == \"server\"]\n server_avg_carbon_intensity = round(server_emissions[\"energy_grid\"].mean(), 2)\n factsheet[\"sustainability\"][\"avg_carbon_intensity_server\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_carbon_intensity_server\"],\n server_avg_carbon_intensity,\n \"\",\n )\n factsheet[\"sustainability\"][\"emissions_aggregation\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_aggregation\"],\n server_emissions[\"emissions\"].sum(),\n \"\",\n )\n GPU_powerperf = (server_emissions.loc[server_emissions[\"GPU_used\"] == True])[\"powerPerformance\"]\n CPU_powerperf = (server_emissions.loc[server_emissions[\"CPU_used\"] == True])[\"powerPerf\"]\n server_power_performance = round(pd.concat([GPU_powerperf, CPU_powerperf]).mean(), 2)\n factsheet[\"sustainability\"][\"avg_power_performance_server\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_power_performance_server\"],\n server_power_performance,\n \"\",\n )\n\n GPU_powerperf = (client_emissions.loc[client_emissions[\"GPU_used\"] == True])[\"powerPerformance\"]\n CPU_powerperf = 
(client_emissions.loc[client_emissions[\"CPU_used\"] == True])[\"powerPerf\"]\n clients_power_performance = round(pd.concat([GPU_powerperf, CPU_powerperf]).mean(), 2)\n factsheet[\"sustainability\"][\"avg_power_performance_clients\"] = clients_power_performance\n\n factsheet[\"sustainability\"][\"emissions_communication_uplink\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_communication_uplink\"],\n factsheet[\"system\"][\"total_upload_bytes\"]\n * 2.24e-10\n * factsheet[\"sustainability\"][\"avg_carbon_intensity_clients\"],\n \"\",\n )\n factsheet[\"sustainability\"][\"emissions_communication_downlink\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_communication_downlink\"],\n factsheet[\"system\"][\"total_download_bytes\"]\n * 2.24e-10\n * factsheet[\"sustainability\"][\"avg_carbon_intensity_server\"],\n \"\",\n )\n\n except JSONDecodeError as e:\n logger.warning(f\"{factsheet_file} is invalid\")\n logger.error(e)\n\n f.seek(0)\n f.truncate()\n json.dump(factsheet, f, indent=4)\n f.close()\n
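Both `populate_*` methods follow the same file pattern: open the JSON file in `r+` mode, load it, modify the dictionary, then `seek(0)`, `truncate()`, and dump it back. In the listing above, a `JSONDecodeError` leaves `factsheet` as `{}`, and that empty object is then written over the invalid file. The sketch below shows the same read-modify-write pattern with one deliberate difference: it leaves the file untouched when parsing fails. The helper name `update_json_in_place` and the two-level `updates` structure are assumptions for illustration.

```python
import json
import os
import tempfile


def update_json_in_place(path, updates):
    """Merge {section: {field: value}} into the JSON file at `path`.

    Returns False and leaves the file unchanged if it is not valid JSON.
    """
    with open(path, "r+", encoding="utf-8") as f:
        try:
            document = json.load(f)
        except json.JSONDecodeError:
            return False
        for section, fields in updates.items():
            document.setdefault(section, {}).update(fields)
        f.seek(0)      # rewind to the start of the file
        f.truncate()   # drop the old content so no stale bytes remain
        json.dump(document, f, indent=4)
    return True


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "factsheet.json")
    with open(good, "w", encoding="utf-8") as f:
        json.dump({"project": {"overview": "", "purpose": "a longer placeholder text"}}, f)
    ok = update_json_in_place(good, {"project": {"purpose": "demo"}})
    with open(good, encoding="utf-8") as f:
        result = json.load(f)

    bad = os.path.join(tmp, "broken.json")
    with open(bad, "w", encoding="utf-8") as f:
        f.write("not json")
    bad_ok = update_json_in_place(bad, {"project": {"purpose": "demo"}})
    with open(bad, encoding="utf-8") as f:
        bad_content = f.read()

print(ok, result, bad_ok)
```

The `with` statement closes the file on exit, so no explicit `close()` call is needed in this pattern.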
__init__()
Manager class to populate the FactSheet
def __init__(self):\n \"\"\"\n Manager class to populate the FactSheet\n \"\"\"\n self.factsheet_file_nm = \"factsheet.json\"\n self.factsheet_template_file_nm = \"factsheet_template.json\"\n
populate_factsheet_post_train(scenario)
Populates the factsheet with values after the training.
The scenario object.
def populate_factsheet_post_train(self, scenario):\n \"\"\"\n Populates the factsheet with values after the training.\n\n Args:\n scenario (object): The scenario object.\n \"\"\"\n scenario_name = scenario[0]\n\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n\n logger.info(\"FactSheet: Populating factsheet with post training metrics\")\n\n with open(factsheet_file, \"r+\") as f:\n factsheet = {}\n try:\n factsheet = json.load(f)\n\n dataset = factsheet[\"data\"][\"provenance\"]\n model = factsheet[\"configuration\"][\"training_model\"]\n\n actual_dir = os.getcwd()\n files_dir = f\"{actual_dir}/trustworthiness/files/{scenario_name}\"\n data_dir = f\"{actual_dir}/trustworthiness/data/\"\n\n models_files = glob.glob(os.path.join(files_dir, \"*final_model*\"))\n bytes_sent_files = glob.glob(os.path.join(files_dir, \"*bytes_sent*\"))\n bytes_recv_files = glob.glob(os.path.join(files_dir, \"*bytes_recv*\"))\n loss_files = glob.glob(os.path.join(files_dir, \"*loss*\"))\n accuracy_files = glob.glob(os.path.join(files_dir, \"*accuracy*\"))\n dataloaders_files = glob.glob(os.path.join(files_dir, \"*train_loader*\"))\n test_dataloader_file = f\"{files_dir}/participant_1_test_loader.pk\"\n train_model_file = f\"{files_dir}/participant_1_train_model.pk\"\n emissions_file = os.path.join(files_dir, \"emissions.csv\")\n\n # Entropy\n i = 0\n for file in dataloaders_files:\n with open(file, \"rb\") as file:\n dataloader = pickle.load(file)\n get_entropy(i, scenario_name, dataloader)\n i += 1\n\n with open(f\"{files_dir}/entropy.json\") as file:\n entropy_distribution = json.load(file)\n\n values = np.array(list(entropy_distribution.values()))\n\n normalized_values = (values - np.min(values)) / (np.max(values) - np.min(values))\n\n avg_entropy = np.mean(normalized_values)\n\n factsheet[\"data\"][\"avg_entropy\"] = avg_entropy\n\n # Set performance data\n result_avg_loss_accuracy = get_avg_loss_accuracy(loss_files, accuracy_files)\n 
factsheet[\"performance\"][\"test_loss_avg\"] = result_avg_loss_accuracy[0]\n factsheet[\"performance\"][\"test_acc_avg\"] = result_avg_loss_accuracy[1]\n test_acc_cv = get_cv(std=result_avg_loss_accuracy[2], mean=result_avg_loss_accuracy[1])\n factsheet[\"fairness\"][\"test_acc_cv\"] = 1 if test_acc_cv > 1 else test_acc_cv\n\n factsheet[\"system\"][\"avg_time_minutes\"] = get_elapsed_time(scenario)\n factsheet[\"system\"][\"avg_model_size\"] = get_bytes_models(models_files)\n\n result_bytes_sent_recv = get_bytes_sent_recv(bytes_sent_files, bytes_recv_files)\n factsheet[\"system\"][\"total_upload_bytes\"] = result_bytes_sent_recv[0]\n factsheet[\"system\"][\"total_download_bytes\"] = result_bytes_sent_recv[1]\n factsheet[\"system\"][\"avg_upload_bytes\"] = result_bytes_sent_recv[2]\n factsheet[\"system\"][\"avg_download_bytes\"] = result_bytes_sent_recv[3]\n\n factsheet[\"fairness\"][\"selection_cv\"] = 1\n\n count_class_samples(scenario_name, dataloaders_files)\n\n with open(f\"{files_dir}/count_class.json\") as file:\n class_distribution = json.load(file)\n\n class_samples_sizes = [x for x in class_distribution.values()]\n class_imbalance = get_cv(list=class_samples_sizes)\n factsheet[\"fairness\"][\"class_imbalance\"] = 1 if class_imbalance > 1 else class_imbalance\n\n with open(train_model_file, \"rb\") as file:\n lightning_model = pickle.load(file)\n\n if dataset == \"MNIST\" and model == \"MLP\":\n pytorch_model = MNISTTorchModelMLP()\n elif dataset == \"MNIST\" and model == \"CNN\":\n pytorch_model = MNISTTorchModelCNN()\n elif dataset == \"Syscall\" and model == \"MLP\":\n pytorch_model = SyscallTorchModelMLP()\n else:\n pytorch_model = CIFAR10TorchModelCNN()\n\n pytorch_model.load_state_dict(lightning_model.state_dict())\n\n with open(test_dataloader_file, \"rb\") as file:\n test_dataloader = pickle.load(file)\n\n test_sample = next(iter(test_dataloader))\n\n lr = factsheet[\"configuration\"][\"learning_rate\"]\n value_clever = 
get_clever_score(pytorch_model, test_sample, 10, lr)\n\n factsheet[\"performance\"][\"test_clever\"] = 1 if value_clever > 1 else value_clever\n\n feature_importance = get_feature_importance_cv(pytorch_model, test_sample)\n\n factsheet[\"performance\"][\"test_feature_importance_cv\"] = (\n 1 if feature_importance > 1 else feature_importance\n )\n\n # Set emissions metrics\n emissions = None if emissions_file is None else read_csv(emissions_file)\n if emissions is not None:\n logger.info(\"FactSheet: Populating emissions\")\n cpu_spez_df = pd.read_csv(os.path.join(data_dir, \"CPU_benchmarks_v4.csv\"), header=0)\n emissions[\"CPU_model\"] = (\n emissions[\"CPU_model\"].astype(str).str.replace(r\"\\([^)]*\\)\", \"\", regex=True)\n )\n emissions[\"CPU_model\"] = emissions[\"CPU_model\"].astype(str).str.replace(r\" CPU\", \"\", regex=True)\n emissions[\"GPU_model\"] = emissions[\"GPU_model\"].astype(str).str.replace(r\"[0-9] x \", \"\", regex=True)\n emissions = pd.merge(\n emissions,\n cpu_spez_df[[\"cpuName\", \"powerPerf\"]],\n left_on=\"CPU_model\",\n right_on=\"cpuName\",\n how=\"left\",\n )\n gpu_spez_df = pd.read_csv(os.path.join(data_dir, \"GPU_benchmarks_v7.csv\"), header=0)\n emissions = pd.merge(\n emissions,\n gpu_spez_df[[\"gpuName\", \"powerPerformance\"]],\n left_on=\"GPU_model\",\n right_on=\"gpuName\",\n how=\"left\",\n )\n\n emissions.drop(\"cpuName\", axis=1, inplace=True)\n emissions.drop(\"gpuName\", axis=1, inplace=True)\n emissions[\"powerPerf\"] = emissions[\"powerPerf\"].astype(float)\n emissions[\"powerPerformance\"] = emissions[\"powerPerformance\"].astype(float)\n client_emissions = emissions.loc[emissions[\"role\"] == \"client\"]\n client_avg_carbon_intensity = round(client_emissions[\"energy_grid\"].mean(), 2)\n factsheet[\"sustainability\"][\"avg_carbon_intensity_clients\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_carbon_intensity_clients\"],\n client_avg_carbon_intensity,\n \"\",\n )\n 
factsheet[\"sustainability\"][\"emissions_training\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_training\"],\n client_emissions[\"emissions\"].sum(),\n \"\",\n )\n factsheet[\"participants\"][\"avg_dataset_size\"] = check_field_filled(\n factsheet,\n [\"participants\", \"avg_dataset_size\"],\n client_emissions[\"sample_size\"].mean(),\n \"\",\n )\n\n server_emissions = emissions.loc[emissions[\"role\"] == \"server\"]\n server_avg_carbon_intensity = round(server_emissions[\"energy_grid\"].mean(), 2)\n factsheet[\"sustainability\"][\"avg_carbon_intensity_server\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_carbon_intensity_server\"],\n server_avg_carbon_intensity,\n \"\",\n )\n factsheet[\"sustainability\"][\"emissions_aggregation\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_aggregation\"],\n server_emissions[\"emissions\"].sum(),\n \"\",\n )\n GPU_powerperf = (server_emissions.loc[server_emissions[\"GPU_used\"] == True])[\"powerPerformance\"]\n CPU_powerperf = (server_emissions.loc[server_emissions[\"CPU_used\"] == True])[\"powerPerf\"]\n server_power_performance = round(pd.concat([GPU_powerperf, CPU_powerperf]).mean(), 2)\n factsheet[\"sustainability\"][\"avg_power_performance_server\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"avg_power_performance_server\"],\n server_power_performance,\n \"\",\n )\n\n GPU_powerperf = (client_emissions.loc[client_emissions[\"GPU_used\"] == True])[\"powerPerformance\"]\n CPU_powerperf = (client_emissions.loc[client_emissions[\"CPU_used\"] == True])[\"powerPerf\"]\n clients_power_performance = round(pd.concat([GPU_powerperf, CPU_powerperf]).mean(), 2)\n factsheet[\"sustainability\"][\"avg_power_performance_clients\"] = clients_power_performance\n\n factsheet[\"sustainability\"][\"emissions_communication_uplink\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_communication_uplink\"],\n 
factsheet[\"system\"][\"total_upload_bytes\"]\n * 2.24e-10\n * factsheet[\"sustainability\"][\"avg_carbon_intensity_clients\"],\n \"\",\n )\n factsheet[\"sustainability\"][\"emissions_communication_downlink\"] = check_field_filled(\n factsheet,\n [\"sustainability\", \"emissions_communication_downlink\"],\n factsheet[\"system\"][\"total_download_bytes\"]\n * 2.24e-10\n * factsheet[\"sustainability\"][\"avg_carbon_intensity_server\"],\n \"\",\n )\n\n except JSONDecodeError as e:\n logger.warning(f\"{factsheet_file} is invalid\")\n logger.error(e)\n\n f.seek(0)\n f.truncate()\n json.dump(factsheet, f, indent=4)\n f.close()\n
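Two small calculations recur in this method: several metrics are capped at 1 before being stored, and the communication emissions are the product of the transferred bytes, the constant `2.24e-10`, and the average carbon intensity. The sketch below isolates both. The constant is copied from the listing; its unit is not stated there, so the name `ENERGY_PER_BYTE` is only an assumed interpretation. The function names and the input numbers are likewise made up for illustration.

```python
# Constant copied from the source listing; the name reflects an assumed meaning.
ENERGY_PER_BYTE = 2.24e-10


def cap_at_one(metric):
    """Store metrics such as coefficients of variation as at most 1."""
    return 1 if metric > 1 else metric


def communication_emissions(total_bytes, carbon_intensity):
    """Bytes transferred times per-byte constant times carbon intensity."""
    return total_bytes * ENERGY_PER_BYTE * carbon_intensity


# Hypothetical inputs: 1 GB uploaded at an average carbon intensity of 200.
uplink = communication_emissions(1_000_000_000, 200.0)
print(round(uplink, 2))  # 44.8
print(cap_at_one(1.7))   # 1
print(cap_at_one(0.3))   # 0.3
```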
populate_factsheet_pre_train(data, scenario_name)
Populates the factsheet with values before the training.
data
Contains the data from the scenario.
scenario_name
The name of the scenario.
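In this method's source, the `background` text is chosen by an `if`/`elif` chain over the attack and defence flags. That chain has no branch for an attack with reputation enabled but neither dynamic aggregation nor dynamic topology, so `background` is never assigned in that case. The sketch below is one way to assemble the text from parts so that every combination produces a value. The function name, the shortened wording, and the sentence used for the uncovered case are assumptions, not the text NEBULA writes.

```python
def build_background(setup, attack, with_reputation=False,
                     dynamic_aggregation=False, dynamic_topology=False,
                     target_aggregation=""):
    """Assemble a background description from the scenario flags."""
    parts = [setup]
    if attack == "No Attack":
        parts.append("No attacks against clients are used.")
    else:
        parts.append(f"The type of attack used against the clients is {attack}.")
        if with_reputation and dynamic_aggregation:
            parts.append(
                "A reputation-based defence with a dynamic aggregation "
                f"based on the aggregation algorithm {target_aggregation} is used."
            )
        elif with_reputation and dynamic_topology:
            parts.append("A reputation-based defence with a dynamic topology is used.")
        elif with_reputation:
            # Hypothetical wording for the combination the source does not cover.
            parts.append("A reputation-based defence is used.")
        else:
            parts.append("No defence mechanism is used.")
    return " ".join(parts)


setup = "The federation architecture is DFL, involving 5 clients."
print(build_background(setup, "No Attack"))
print(build_background(setup, "Label Flipping", with_reputation=True))
```

The scenario values in the usage lines ("DFL", "Label Flipping", 5 clients) are placeholders chosen for the example.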
def populate_factsheet_pre_train(self, data, scenario_name):\n \"\"\"\n Populates the factsheet with values before the training.\n\n Args:\n data (dict): Contains the data from the scenario.\n scenario_name (string): The name of the scenario.\n \"\"\"\n\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n\n factsheet_template = os.path.join(dirname, f\"configs/{self.factsheet_template_file_nm}\")\n\n if not os.path.exists(factsheet_file):\n shutil.copyfile(factsheet_template, factsheet_file)\n\n with open(factsheet_file, \"r+\") as f:\n factsheet = {}\n\n try:\n factsheet = json.load(f)\n\n if data is not None:\n logger.info(\"FactSheet: Populating factsheet with pre training metrics\")\n\n federation = data[\"federation\"]\n n_nodes = int(data[\"n_nodes\"])\n dataset = data[\"dataset\"]\n algorithm = data[\"model\"]\n aggregation_algorithm = data[\"agg_algorithm\"]\n n_rounds = int(data[\"rounds\"])\n attack = data[\"attacks\"]\n poisoned_node_percent = int(data[\"poisoned_node_percent\"])\n poisoned_sample_percent = int(data[\"poisoned_sample_percent\"])\n poisoned_noise_percent = int(data[\"poisoned_noise_percent\"])\n with_reputation = data[\"with_reputation\"]\n is_dynamic_topology = data[\"is_dynamic_topology\"]\n is_dynamic_aggregation = data[\"is_dynamic_aggregation\"]\n target_aggregation = data[\"target_aggregation\"]\n\n if attack != \"No Attack\" and with_reputation == True and is_dynamic_aggregation == True:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. 
In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. A reputation-based defence with a dynamic aggregation based on the aggregation algorithm {target_aggregation} is used, and the trustworthiness of the project is desired.\"\n\n elif attack != \"No Attack\" and with_reputation == True and is_dynamic_topology == True:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. A reputation-based defence with a dynamic topology is used, and the trustworthiness of the project is desired.\"\n\n elif attack != \"No Attack\" and with_reputation == False:\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. In addition, the type of attack used against the clients is {attack}, where the percentage of attacked nodes is {poisoned_node_percent}, the percentage of attacked samples of each node is {poisoned_sample_percent}, and the percent of poisoned noise is {poisoned_noise_percent}. 
No defence mechanism is used, and the trustworthiness of the project is desired.\"\n\n elif attack == \"No Attack\":\n background = f\"For the project setup, the most important aspects are the following: The federation architecture is {federation}, involving {n_nodes} clients, the dataset used is {dataset}, the learning algorithm is {algorithm}, the aggregation algorithm is {aggregation_algorithm} and the number of rounds is {n_rounds}. No attacks against clients are used, and the trustworthiness of the project is desired.\"\n\n # Set project specifications\n factsheet[\"project\"][\"overview\"] = data[\"scenario_title\"]\n factsheet[\"project\"][\"purpose\"] = data[\"scenario_description\"]\n factsheet[\"project\"][\"background\"] = background\n\n # Set data specifications\n factsheet[\"data\"][\"provenance\"] = data[\"dataset\"]\n factsheet[\"data\"][\"preprocessing\"] = data[\"topology\"]\n\n # Set participants\n factsheet[\"participants\"][\"client_num\"] = data[\"n_nodes\"] or \"\"\n factsheet[\"participants\"][\"sample_client_rate\"] = 1\n factsheet[\"participants\"][\"client_selector\"] = \"\"\n\n # Set configuration\n factsheet[\"configuration\"][\"aggregation_algorithm\"] = data[\"agg_algorithm\"] or \"\"\n factsheet[\"configuration\"][\"training_model\"] = data[\"model\"] or \"\"\n factsheet[\"configuration\"][\"personalization\"] = False\n factsheet[\"configuration\"][\"visualization\"] = True\n factsheet[\"configuration\"][\"total_round_num\"] = n_rounds\n\n if poisoned_noise_percent != 0:\n factsheet[\"configuration\"][\"differential_privacy\"] = True\n factsheet[\"configuration\"][\"dp_epsilon\"] = poisoned_noise_percent\n else:\n factsheet[\"configuration\"][\"differential_privacy\"] = False\n factsheet[\"configuration\"][\"dp_epsilon\"] = \"\"\n\n if dataset == \"MNIST\" and algorithm == \"MLP\":\n model = MNISTModelMLP()\n elif dataset == \"MNIST\" and algorithm == \"CNN\":\n model = MNISTModelCNN()\n elif dataset == \"Syscall\" and algorithm == 
\"MLP\":\n model = SyscallModelMLP()\n else:\n model = CIFAR10ModelCNN()\n\n factsheet[\"configuration\"][\"learning_rate\"] = model.get_learning_rate()\n factsheet[\"configuration\"][\"trainable_param_num\"] = model.count_parameters()\n factsheet[\"configuration\"][\"local_update_steps\"] = 1\n\n except JSONDecodeError as e:\n logger.warning(f\"{factsheet_file} is invalid\")\n logger.error(e)\n\n f.seek(0)\n f.truncate()\n json.dump(factsheet, f, indent=4)\n f.close()\n
TrustMetricManager
Manager class to help store the output directory and handle calls from the FL framework.
nebula/addons/trustworthiness/metric.py
class TrustMetricManager:\n \"\"\"\n Manager class to help store the output directory and handle calls from the FL framework.\n \"\"\"\n\n def __init__(self):\n self.factsheet_file_nm = \"factsheet.json\"\n self.eval_metrics_file_nm = \"eval_metrics.json\"\n self.nebula_trust_results_nm = \"nebula_trust_results.json\"\n\n def evaluate(self, scenario, weights, use_weights=False):\n \"\"\"\n Evaluates the trustworthiness score.\n\n Args:\n scenario (object): The scenario in whith the trustworthiness will be calculated.\n weights (dict): The desired weghts of the pillars.\n use_weights (bool): True to turn on the weights in the metric config file, default to False.\n \"\"\"\n # Get scenario name\n scenario_name = scenario[0]\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n metrics_cfg_file = os.path.join(dirname, f\"configs/{self.eval_metrics_file_nm}\")\n results_file = os.path.join(dirname, f\"files/{scenario_name}/{self.nebula_trust_results_nm}\")\n\n if not os.path.exists(factsheet_file):\n logger.error(f\"{factsheet_file} is missing! Please check documentation.\")\n return\n\n if not os.path.exists(metrics_cfg_file):\n logger.error(f\"{metrics_cfg_file} is missing! Please check documentation.\")\n return\n\n with open(factsheet_file) as f, open(metrics_cfg_file) as m:\n factsheet = json.load(f)\n metrics_cfg = json.load(m)\n metrics = metrics_cfg.items()\n input_docs = {\"factsheet\": factsheet}\n\n result_json = {\"trust_score\": 0, \"pillars\": []}\n final_score = 0\n result_print = []\n for key, value in metrics:\n pillar = TrustPillar(key, value, input_docs, use_weights)\n score, result = pillar.evaluate()\n weight = weights.get(key)\n final_score += weight * score\n result_print.append([key, score])\n result_json[\"pillars\"].append(result)\n final_score = round(final_score, 2)\n result_json[\"trust_score\"] = final_score\n write_results_json(results_file, result_json)\n
evaluate(scenario, weights, use_weights=False)
Evaluates the trustworthiness score.
The scenario in which the trustworthiness will be calculated.
weights
The desired weights of the pillars.
use_weights
True to turn on the weights in the metric config file; defaults to False.
False
def evaluate(self, scenario, weights, use_weights=False):\n \"\"\"\n Evaluates the trustworthiness score.\n\n Args:\n scenario (object): The scenario in whith the trustworthiness will be calculated.\n weights (dict): The desired weghts of the pillars.\n use_weights (bool): True to turn on the weights in the metric config file, default to False.\n \"\"\"\n # Get scenario name\n scenario_name = scenario[0]\n factsheet_file = os.path.join(dirname, f\"files/{scenario_name}/{self.factsheet_file_nm}\")\n metrics_cfg_file = os.path.join(dirname, f\"configs/{self.eval_metrics_file_nm}\")\n results_file = os.path.join(dirname, f\"files/{scenario_name}/{self.nebula_trust_results_nm}\")\n\n if not os.path.exists(factsheet_file):\n logger.error(f\"{factsheet_file} is missing! Please check documentation.\")\n return\n\n if not os.path.exists(metrics_cfg_file):\n logger.error(f\"{metrics_cfg_file} is missing! Please check documentation.\")\n return\n\n with open(factsheet_file) as f, open(metrics_cfg_file) as m:\n factsheet = json.load(f)\n metrics_cfg = json.load(m)\n metrics = metrics_cfg.items()\n input_docs = {\"factsheet\": factsheet}\n\n result_json = {\"trust_score\": 0, \"pillars\": []}\n final_score = 0\n result_print = []\n for key, value in metrics:\n pillar = TrustPillar(key, value, input_docs, use_weights)\n score, result = pillar.evaluate()\n weight = weights.get(key)\n final_score += weight * score\n result_print.append([key, score])\n result_json[\"pillars\"].append(result)\n final_score = round(final_score, 2)\n result_json[\"trust_score\"] = final_score\n write_results_json(results_file, result_json)\n
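The final trust score computed by `evaluate` is a weighted sum of the per-pillar scores, rounded to two decimals. A minimal sketch of that aggregation step follows; the pillar names and weight values are hypothetical, and the helper `combine_pillar_scores` is not part of NEBULA. Unlike the listing above, the sketch falls back to a weight of 0 when a pillar has no entry in `weights`, since `weights.get(key)` would otherwise return `None` and the multiplication would fail.

```python
def combine_pillar_scores(pillar_scores, weights):
    """Weighted sum of pillar scores, rounded to two decimals.

    pillar_scores: dict mapping pillar name -> score in [0, 1]
    weights: dict mapping pillar name -> weight (assumed to sum to 1)
    """
    total = 0.0
    for name, score in pillar_scores.items():
        # Assumption: a pillar without a configured weight contributes nothing.
        total += weights.get(name, 0.0) * score
    return round(total, 2)


# Hypothetical pillar names and values, for illustration only.
scores = {"robustness": 0.8, "privacy": 0.5, "fairness": 0.6}
weights = {"robustness": 0.5, "privacy": 0.3, "fairness": 0.2}
trust_score = combine_pillar_scores(scores, weights)
print(trust_score)
```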
TrustPillar
Class to represent a trust pillar.
name
Name of the pillar.
metrics
Metric definitions for the pillar.
input_docs
Input documents.
True to turn on the weights in the metric config file.
nebula/addons/trustworthiness/pillar.py
class TrustPillar:\n \"\"\"\n Class to represent a trust pillar.\n\n Args:\n name (string): Name of the pillar.\n metrics (dict): Metric definitions for the pillar.\n input_docs (dict): Input documents.\n use_weights (bool): True to turn on the weights in the metric config file.\n\n \"\"\"\n\n def __init__(self, name, metrics, input_docs, use_weights=False):\n self.name = name\n self.input_docs = input_docs\n self.metrics = metrics\n self.result = []\n self.use_weights = use_weights\n\n def evaluate(self):\n \"\"\"\n Evaluate the trust score for the pillar.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n score = 0\n avg_weight = 1 / len(self.metrics)\n for key, value in self.metrics.items():\n weight = value.get(\"weight\", avg_weight) if self.use_weights else avg_weight\n score += weight * self.get_notion_score(key, value.get(\"metrics\"))\n score = round(score, 2)\n return score, {self.name: {\"score\": score, \"notions\": self.result}}\n\n def get_notion_score(self, name, metrics):\n \"\"\"\n Evaluate the trust score for the notion.\n\n Args:\n name (string): Name of the notion.\n metrics (list): Metrics definitions of the notion.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n\n notion_score = 0\n avg_weight = 1 / len(metrics)\n metrics_result = []\n for key, value in metrics.items():\n metric_score = self.get_metric_score(metrics_result, key, value)\n weight = value.get(\"weight\", avg_weight) if self.use_weights else avg_weight\n notion_score += weight * float(metric_score)\n self.result.append({name: {\"score\": notion_score, \"metrics\": metrics_result}})\n return notion_score\n\n def get_metric_score(self, result, name, metric):\n \"\"\"\n Evaluate the trust score for the metric.\n\n Args:\n result (object): The result object\n name (string): Name of the metric.\n metrics (dict): The metric definition.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n\n score = 0\n try:\n input_value = get_input_value(self.input_docs, metric.get(\"inputs\"), 
metric.get(\"operation\"))\n\n score_type = metric.get(\"type\")\n if input_value is None:\n logger.warning(f\"{name} input value is null\")\n else:\n if score_type == \"true_score\":\n score = calculation.get_true_score(input_value, metric.get(\"direction\"))\n elif score_type == \"score_mapping\":\n score = calculation.get_mapped_score(input_value, metric.get(\"score_map\"))\n elif score_type == \"ranges\":\n score = calculation.get_range_score(input_value, metric.get(\"ranges\"), metric.get(\"direction\"))\n elif score_type == \"score_map_value\":\n score = calculation.get_map_value_score(input_value, metric.get(\"score_map\"))\n elif score_type == \"scaled_score\":\n score = calculation.get_scaled_score(input_value, metric.get(\"scale\"), metric.get(\"direction\"))\n elif score_type == \"property_check\":\n score = 0 if input_value is None else input_value\n\n else:\n logger.warning(f\"The score type {score_type} is not yet implemented.\")\n\n except KeyError:\n logger.warning(f\"Null input for {name} metric\")\n score = round(score, 2)\n result.append({name: {\"score\": score}})\n return score\n
evaluate()
Evaluate the trust score for the pillar.
Score in the range [0, 1].
def evaluate(self):\n \"\"\"\n Evaluate the trust score for the pillar.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n score = 0\n avg_weight = 1 / len(self.metrics)\n for key, value in self.metrics.items():\n weight = value.get(\"weight\", avg_weight) if self.use_weights else avg_weight\n score += weight * self.get_notion_score(key, value.get(\"metrics\"))\n score = round(score, 2)\n return score, {self.name: {\"score\": score, \"notions\": self.result}}\n
get_metric_score(result, name, metric)
Evaluate the trust score for the metric.
result
The result object
Name of the metric.
The metric definition.
def get_metric_score(self, result, name, metric):\n \"\"\"\n Evaluate the trust score for the metric.\n\n Args:\n result (object): The result object\n name (string): Name of the metric.\n metrics (dict): The metric definition.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n\n score = 0\n try:\n input_value = get_input_value(self.input_docs, metric.get(\"inputs\"), metric.get(\"operation\"))\n\n score_type = metric.get(\"type\")\n if input_value is None:\n logger.warning(f\"{name} input value is null\")\n else:\n if score_type == \"true_score\":\n score = calculation.get_true_score(input_value, metric.get(\"direction\"))\n elif score_type == \"score_mapping\":\n score = calculation.get_mapped_score(input_value, metric.get(\"score_map\"))\n elif score_type == \"ranges\":\n score = calculation.get_range_score(input_value, metric.get(\"ranges\"), metric.get(\"direction\"))\n elif score_type == \"score_map_value\":\n score = calculation.get_map_value_score(input_value, metric.get(\"score_map\"))\n elif score_type == \"scaled_score\":\n score = calculation.get_scaled_score(input_value, metric.get(\"scale\"), metric.get(\"direction\"))\n elif score_type == \"property_check\":\n score = 0 if input_value is None else input_value\n\n else:\n logger.warning(f\"The score type {score_type} is not yet implemented.\")\n\n except KeyError:\n logger.warning(f\"Null input for {name} metric\")\n score = round(score, 2)\n result.append({name: {\"score\": score}})\n return score\n
get_notion_score(name, metrics)
Evaluate the trust score for the notion.
Name of the notion.
Metric definitions of the notion.
def get_notion_score(self, name, metrics):\n \"\"\"\n Evaluate the trust score for the notion.\n\n Args:\n name (string): Name of the notion.\n metrics (list): Metrics definitions of the notion.\n\n Returns:\n float: Score of [0, 1].\n \"\"\"\n\n notion_score = 0\n avg_weight = 1 / len(metrics)\n metrics_result = []\n for key, value in metrics.items():\n metric_score = self.get_metric_score(metrics_result, key, value)\n weight = value.get(\"weight\", avg_weight) if self.use_weights else avg_weight\n notion_score += weight * float(metric_score)\n self.result.append({name: {\"score\": notion_score, \"metrics\": metrics_result}})\n return notion_score\n
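Both `evaluate` and `get_notion_score` use the same weighting rule: each child gets an equal share `1 / len(children)` unless `use_weights` is enabled, in which case a child's own `"weight"` entry takes precedence. The sketch below isolates that rule. The function name `weighted_average` and the input layout (a `"score"` key next to the optional `"weight"`) are assumptions made for illustration, not NEBULA's actual data model.

```python
def weighted_average(items, use_weights=False):
    """Combine child scores with equal or configured weights.

    items: dict mapping name -> {"score": float, "weight": float (optional)}
    """
    avg_weight = 1 / len(items)
    total = 0.0
    for value in items.values():
        # Same fallback as in the listings above: use the configured weight
        # only when use_weights is on and the entry defines one.
        weight = value.get("weight", avg_weight) if use_weights else avg_weight
        total += weight * value["score"]
    return total


# Hypothetical metrics for one notion.
metrics = {
    "metric_a": {"score": 1.0, "weight": 0.75},
    "metric_b": {"score": 0.0, "weight": 0.25},
}
equal = weighted_average(metrics)                       # both count 0.5
configured = weighted_average(metrics, use_weights=True)  # 0.75 vs 0.25
```

With equal weights the two metrics average out; with configured weights the first metric dominates. Configured weights are not normalized in this rule, so they need to sum to 1 for the result to stay in [0, 1].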
check_field_filled(factsheet_dict, factsheet_path, value, empty='')
Check if the field in the factsheet file is filled or not.
factsheet_dict
The factsheet dict.
factsheet_path
The factsheet field to check.
The value to add in the field.
empty
The value returned when the field could not be filled; defaults to the empty string.
''
The value added to the factsheet, or empty if the value could not be appended.
nebula/addons/trustworthiness/utils.py
def check_field_filled(factsheet_dict, factsheet_path, value, empty=\"\"):\n \"\"\"\n Check if the field in the factsheet file is filled or not.\n\n Args:\n factsheet_dict (dict): The factsheet dict.\n factsheet_path (list): The factsheet field to check.\n value (float): The value to add in the field.\n empty (string): The value returned when the field could not be filled; defaults to the empty string.\n\n Returns:\n float: The value added to the factsheet, or empty if the value could not be appended.\n\n \"\"\"\n if factsheet_dict[factsheet_path[0]][factsheet_path[1]]:\n return factsheet_dict[factsheet_path[0]][factsheet_path[1]]\n elif value != \"\" and value != \"nan\":\n if type(value) != str and type(value) != list:\n if math.isnan(value):\n return 0\n else:\n return value\n else:\n return value\n else:\n return empty\n
count_class_samples(scenario_name, dataloaders_files)
Counts the number of samples by class.
Name of the scenario.
dataloaders_files
Files that contain the dataloaders.
def count_class_samples(scenario_name, dataloaders_files):\n \"\"\"\n Counts the number of samples by class.\n\n Args:\n scenario_name (string): Name of the scenario.\n dataloaders_files (list): Files that contain the dataloaders.\n\n \"\"\"\n\n result = {}\n dataloaders = []\n\n for file in dataloaders_files:\n with open(file, \"rb\") as f:\n dataloader = pickle.load(f)\n dataloaders.append(dataloader)\n\n for dataloader in dataloaders:\n for batch, labels in dataloader:\n for b, label in zip(batch, labels, strict=False):\n l = hashids.encode(label.item())\n if l in result:\n result[l] += 1\n else:\n result[l] = 1\n\n name_file = f\"{dirname}/files/{scenario_name}/count_class.json\"\n with open(name_file, \"w\") as f:\n json.dump(result, f)\n
get_entropy(client_id, scenario_name, dataloader)
Get the entropy of each client in the scenario.
client_id
The client id.
def get_entropy(client_id, scenario_name, dataloader):\n \"\"\"\n Get the entropy of each client in the scenario.\n\n Args:\n client_id (int): The client id.\n scenario_name (string): Name of the scenario.\n dataloader (iterable): The client's dataloader, yielding (batch, labels) pairs.\n\n \"\"\"\n result = {}\n client_entropy = {}\n\n name_file = f\"{dirname}/files/{scenario_name}/entropy.json\"\n if os.path.exists(name_file):\n with open(name_file) as f:\n client_entropy = json.load(f)\n\n client_id_hash = hashids.encode(client_id)\n\n for batch, labels in dataloader:\n for b, label in zip(batch, labels, strict=False):\n l = hashids.encode(label.item())\n if l in result:\n result[l] += 1\n else:\n result[l] = 1\n\n n = len(dataloader)\n entropy_value = entropy([x / n for x in result.values()], base=2)\n client_entropy[client_id_hash] = entropy_value\n with open(name_file, \"w\") as f:\n json.dump(client_entropy, f)\n
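The quantity stored per client is the Shannon entropy (base 2) of the client's label distribution. The sketch below computes the same quantity with the standard library only, so it can be run without SciPy or a dataloader; `label_entropy` is a hypothetical helper, not a NEBULA function. In the listing above, the counts are divided by `len(dataloader)` (the number of batches for a typical dataloader) rather than by the number of samples. As far as I can tell this does not change the result, because `scipy.stats.entropy` normalizes its input to sum to 1.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Sum of -p * log2(p) over the observed classes.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


balanced = label_entropy([0, 1, 0, 1])   # two equally likely classes
single = label_entropy([0, 0, 0, 0])     # only one class present
uniform4 = label_entropy([0, 1, 2, 3])   # four equally likely classes
```

A client holding a single class has entropy 0, while a client with `k` equally frequent classes reaches the maximum of `log2(k)` bits.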
get_input_value(input_docs, inputs, operation)
Gets the input value from the input document and applies the metric operation to it.
input_docs
map
The input document map.
inputs
All the inputs.
operation
The metric operation.
The metric value
def get_input_value(input_docs, inputs, operation):\n \"\"\"\n Gets the input value from the input document and applies the metric operation to it.\n\n Args:\n input_docs (map): The input document map.\n inputs (list): All the inputs.\n operation (string): The metric operation.\n\n Returns:\n float: The metric value\n\n \"\"\"\n\n input_value = None\n args = []\n for i in inputs:\n source = i.get(\"source\", \"\")\n field = i.get(\"field_path\", \"\")\n input_doc = input_docs.get(source, None)\n if input_doc is None:\n logger.warning(f\"{source} is null\")\n else:\n input = get_value_from_path(input_doc, field)\n args.append(input)\n try:\n operationFn = getattr(calculation, operation)\n input_value = operationFn(*args)\n except TypeError:\n logger.warning(f\"{operation} is not valid\")\n\n return input_value\n
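`get_input_value` resolves the operation by name with `getattr` and then calls it with the collected arguments. The sketch below shows that dispatch pattern in isolation. The `calculation` namespace and its two operations are stand-ins invented for this example, not the real `calculation` module. One difference from the listing is deliberate: the sketch also catches `AttributeError`, which `getattr` raises for an unknown operation name, whereas the listing only catches `TypeError` (raised, for instance, when `operation` is `None`).

```python
import types

# Hypothetical stand-in for the real calculation module.
calculation = types.SimpleNamespace(
    get_value=lambda value: value,
    get_ratio=lambda part, whole: part / whole if whole else None,
)


def apply_operation(operation, args):
    """Look up an operation by name and apply it to the arguments."""
    try:
        operation_fn = getattr(calculation, operation)
        return operation_fn(*args)
    except (TypeError, AttributeError):
        # TypeError: operation is not a string, or wrong argument count.
        # AttributeError: no operation with that name exists.
        return None


ratio = apply_operation("get_ratio", [1, 4])
unknown = apply_operation("does_not_exist", [1])
missing_name = apply_operation(None, [1])
```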
get_value_from_path(input_doc, path)
Gets the input value from input document by path.
input_doc
path
The "/"-separated path to the field of interest in the input document.
The input value from the input document
def get_value_from_path(input_doc, path):\n \"\"\"\n Gets the input value from input document by path.\n\n Args:\n input_doc (map): The input document map.\n path (string): The \"/\"-separated path to the field of interest in the input document.\n\n Returns:\n float: The input value from the input document\n\n \"\"\"\n\n d = input_doc\n for nested_key in path.split(\"/\"):\n temp = d.get(nested_key)\n if isinstance(temp, dict):\n d = d.get(nested_key)\n else:\n return temp\n return None\n
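A short usage sketch may make the path semantics clearer. The function body below repeats the logic of the listing above so that the snippet runs on its own; the factsheet content is made up for illustration. Note two edge cases that follow from the code: a missing key yields `None`, and a path that ends on a nested dict (rather than a leaf value) also yields `None`.

```python
def get_value_from_path(input_doc, path):
    """Walk a nested dict along a "/"-separated path (logic as in the listing)."""
    d = input_doc
    for nested_key in path.split("/"):
        temp = d.get(nested_key)
        if isinstance(temp, dict):
            d = temp  # descend one level
        else:
            return temp  # leaf value, or None if the key is missing
    return None  # the path ended on a dict, not on a leaf


# Hypothetical factsheet fragment.
factsheet = {"configuration": {"total_round_num": 10, "training_model": "MLP"}}

rounds = get_value_from_path(factsheet, "configuration/total_round_num")
missing = get_value_from_path(factsheet, "configuration/unknown_field")
subtree = get_value_from_path(factsheet, "configuration")
```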
read_csv(filename)
Read a CSV file.
filename
Name of the file.
The contents of the CSV file, or None if the file does not exist.
def read_csv(filename):\n \"\"\"\n Read a CSV file.\n\n Args:\n filename (string): Name of the file.\n\n Returns:\n object: The contents of the CSV file, or None if the file does not exist.\n\n \"\"\"\n if exists(filename):\n return pd.read_csv(filename)\n
write_results_json(out_file, dict)
Writes the result to JSON.
out_file
The output file.
The object to be written into JSON.
def write_results_json(out_file, dict):\n \"\"\"\n Writes the result to JSON.\n\n Args:\n out_file (string): The output file.\n dict (dict): The object to be written into JSON.\n\n \"\"\"\n\n with open(out_file, \"a\") as f:\n json.dump(dict, f, indent=4)\n
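`write_results_json` opens the output file in append mode (`"a"`). The sketch below illustrates one consequence worth knowing: if the results file already exists from an earlier evaluation, a second call concatenates a second JSON document, and the file can no longer be parsed with a single `json.load`. The helper `write_json`, the file name, and the score values are hypothetical; whether overwriting (`"w"`) is the right behavior for NEBULA depends on how the results file is consumed, which the listing does not show.

```python
import json
import os
import tempfile


def write_json(out_file, payload, mode):
    """Dump payload as indented JSON using the given file mode."""
    with open(out_file, mode) as f:
        json.dump(payload, f, indent=4)


path = os.path.join(tempfile.mkdtemp(), "results.json")

# Two evaluations appended to the same file, as mode "a" would do.
write_json(path, {"trust_score": 0.5}, "a")
write_json(path, {"trust_score": 0.7}, "a")
try:
    with open(path) as f:
        json.load(f)
    appended_is_valid = True
except json.JSONDecodeError:
    # Two concatenated documents are reported as "Extra data".
    appended_is_valid = False

# Overwriting keeps exactly one document in the file.
write_json(path, {"trust_score": 0.7}, "w")
with open(path) as f:
    overwritten = json.load(f)
```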
Engine
nebula/core/engine.py
class Engine:\n def __init__(\n self,\n model,\n dataset,\n config=Config,\n trainer=Lightning,\n security=False,\n model_poisoning=False,\n poisoned_ratio=0,\n noise_type=\"gaussian\",\n ):\n self.config = config\n self.idx = config.participant[\"device_args\"][\"idx\"]\n self.experiment_name = config.participant[\"scenario_args\"][\"name\"]\n self.ip = config.participant[\"network_args\"][\"ip\"]\n self.port = config.participant[\"network_args\"][\"port\"]\n self.addr = config.participant[\"network_args\"][\"addr\"]\n self.role = config.participant[\"device_args\"][\"role\"]\n self.name = config.participant[\"device_args\"][\"name\"]\n self.docker_id = config.participant[\"device_args\"][\"docker_id\"]\n self.client = docker.from_env()\n\n print_banner()\n\n print_msg_box(\n msg=f\"Name {self.name}\\nRole: {self.role}\",\n indent=2,\n title=\"Node information\",\n )\n\n self._trainer = None\n self._aggregator = None\n self.round = None\n self.total_rounds = None\n self.federation_nodes = set()\n self.initialized = False\n self.log_dir = os.path.join(config.participant[\"tracking_args\"][\"log_dir\"], self.experiment_name)\n\n self.security = security\n self.model_poisoning = model_poisoning\n self.poisoned_ratio = poisoned_ratio\n self.noise_type = noise_type\n\n self._trainer = trainer(model, dataset, config=self.config)\n self._aggregator = create_aggregator(config=self.config, engine=self)\n\n self._secure_neighbors = []\n self._is_malicious = True if self.config.participant[\"adversarial_args\"][\"attacks\"] != \"No Attack\" else False\n\n msg = f\"Trainer: {self._trainer.__class__.__name__}\"\n msg += f\"\\nDataset: {self.config.participant['data_args']['dataset']}\"\n msg += f\"\\nIID: {self.config.participant['data_args']['iid']}\"\n msg += f\"\\nModel: {model.__class__.__name__}\"\n msg += f\"\\nAggregation algorithm: {self._aggregator.__class__.__name__}\"\n msg += f\"\\nNode behavior: {'malicious' if self._is_malicious else 'benign'}\"\n 
print_msg_box(msg=msg, indent=2, title=\"Scenario information\")\n print_msg_box(\n msg=f\"Logging type: {self._trainer.logger.__class__.__name__}\",\n indent=2,\n title=\"Logging information\",\n )\n\n self.with_reputation = self.config.participant[\"defense_args\"][\"with_reputation\"]\n self.is_dynamic_topology = self.config.participant[\"defense_args\"][\"is_dynamic_topology\"]\n self.is_dynamic_aggregation = self.config.participant[\"defense_args\"][\"is_dynamic_aggregation\"]\n self.target_aggregation = (\n create_target_aggregator(config=self.config, engine=self) if self.is_dynamic_aggregation else None\n )\n msg = f\"Reputation system: {self.with_reputation}\\nDynamic topology: {self.is_dynamic_topology}\\nDynamic aggregation: {self.is_dynamic_aggregation}\"\n msg += (\n f\"\\nTarget aggregation: {self.target_aggregation.__class__.__name__}\" if self.is_dynamic_aggregation else \"\"\n )\n print_msg_box(msg=msg, indent=2, title=\"Defense information\")\n\n self.learning_cycle_lock = Locker(name=\"learning_cycle_lock\", async_lock=True)\n self.federation_setup_lock = Locker(name=\"federation_setup_lock\", async_lock=True)\n self.federation_ready_lock = Locker(name=\"federation_ready_lock\", async_lock=True)\n self.round_lock = Locker(name=\"round_lock\", async_lock=True)\n\n self.config.reload_config_file()\n\n self._cm = CommunicationsManager(engine=self)\n # Set the communication manager in the model (send messages from there)\n self.trainer.model.set_communication_manager(self._cm)\n\n self._reporter = Reporter(config=self.config, trainer=self.trainer, cm=self.cm)\n\n self._event_manager = EventManager(\n default_callbacks=[\n self._discovery_discover_callback,\n self._control_alive_callback,\n self._connection_connect_callback,\n self._connection_disconnect_callback,\n self._federation_ready_callback,\n self._start_federation_callback,\n self._federation_models_included_callback,\n ]\n )\n\n # Register additional callbacks\n 
self._event_manager.register_event(\n (\n nebula_pb2.FederationMessage,\n nebula_pb2.FederationMessage.Action.REPUTATION,\n ),\n self._reputation_callback,\n )\n # ... add more callbacks here\n\n @property\n def cm(self):\n return self._cm\n\n @property\n def reporter(self):\n return self._reporter\n\n @property\n def event_manager(self):\n return self._event_manager\n\n @property\n def aggregator(self):\n return self._aggregator\n\n def get_aggregator_type(self):\n return type(self.aggregator)\n\n @property\n def trainer(self):\n return self._trainer\n\n def get_addr(self):\n return self.addr\n\n def get_config(self):\n return self.config\n\n def get_federation_nodes(self):\n return self.federation_nodes\n\n def get_initialization_status(self):\n return self.initialized\n\n def set_initialization_status(self, status):\n self.initialized = status\n\n def get_round(self):\n return self.round\n\n def get_federation_ready_lock(self):\n return self.federation_ready_lock\n\n def get_federation_setup_lock(self):\n return self.federation_setup_lock\n\n def get_round_lock(self):\n return self.round_lock\n\n @event_handler(nebula_pb2.DiscoveryMessage, nebula_pb2.DiscoveryMessage.Action.DISCOVER)\n async def _discovery_discover_callback(self, source, message):\n logging.info(\n f\"\ud83d\udd0d handle_discovery_message | Trigger | Received discovery message from {source} (network propagation)\"\n )\n current_connections = await self.cm.get_addrs_current_connections(myself=True)\n if source not in current_connections:\n logging.info(f\"\ud83d\udd0d handle_discovery_message | Trigger | Connecting to {source} indirectly\")\n await self.cm.connect(source, direct=False)\n async with self.cm.get_connections_lock():\n if source in self.cm.connections:\n # Update the latitude and longitude of the node (if already connected)\n if (\n message.latitude is not None\n and -90 <= message.latitude <= 90\n and message.longitude is not None\n and -180 <= message.longitude <= 180\n ):\n 
self.cm.connections[source].update_geolocation(message.latitude, message.longitude)\n else:\n logging.warning(\n f\"\ud83d\udd0d Invalid geolocation received from {source}: latitude={message.latitude}, longitude={message.longitude}\"\n )\n\n @event_handler(nebula_pb2.ControlMessage, nebula_pb2.ControlMessage.Action.ALIVE)\n async def _control_alive_callback(self, source, message):\n logging.info(f\"\ud83d\udd27 handle_control_message | Trigger | Received alive message from {source}\")\n current_connections = await self.cm.get_addrs_current_connections(myself=True)\n if source in current_connections:\n try:\n await self.cm.health.alive(source)\n except Exception as e:\n logging.exception(f\"Error updating alive status in connection: {e}\")\n else:\n logging.error(f\"\u2757\ufe0f Connection {source} not found in connections...\")\n\n @event_handler(nebula_pb2.ConnectionMessage, nebula_pb2.ConnectionMessage.Action.CONNECT)\n async def _connection_connect_callback(self, source, message):\n logging.info(f\"\ud83d\udd17 handle_connection_message | Trigger | Received connection message from {source}\")\n current_connections = await self.cm.get_addrs_current_connections(myself=True)\n if source not in current_connections:\n logging.info(f\"\ud83d\udd17 handle_connection_message | Trigger | Connecting to {source}\")\n await self.cm.connect(source, direct=True)\n\n @event_handler(nebula_pb2.ConnectionMessage, nebula_pb2.ConnectionMessage.Action.DISCONNECT)\n async def _connection_disconnect_callback(self, source, message):\n logging.info(f\"\ud83d\udd17 handle_connection_message | Trigger | Received disconnection message from {source}\")\n await self.cm.disconnect(source, mutual_disconnection=False)\n\n @event_handler(\n nebula_pb2.FederationMessage,\n nebula_pb2.FederationMessage.Action.FEDERATION_READY,\n )\n async def _federation_ready_callback(self, source, message):\n logging.info(f\"\ud83d\udcdd handle_federation_message | Trigger | Received ready federation message 
from {source}\")\n if self.config.participant[\"device_args\"][\"start\"]:\n logging.info(f\"\ud83d\udcdd handle_federation_message | Trigger | Adding ready connection {source}\")\n await self.cm.add_ready_connection(source)\n\n @event_handler(\n nebula_pb2.FederationMessage,\n nebula_pb2.FederationMessage.Action.FEDERATION_START,\n )\n async def _start_federation_callback(self, source, message):\n logging.info(f\"\ud83d\udcdd handle_federation_message | Trigger | Received start federation message from {source}\")\n await self.create_trainer_module()\n\n @event_handler(nebula_pb2.FederationMessage, nebula_pb2.FederationMessage.Action.REPUTATION)\n async def _reputation_callback(self, source, message):\n malicious_nodes = message.arguments # List of malicious nodes\n if self.with_reputation:\n if len(malicious_nodes) > 0 and not self._is_malicious:\n if self.is_dynamic_topology:\n await self._disrupt_connection_using_reputation(malicious_nodes)\n if self.is_dynamic_aggregation and self.aggregator != self.target_aggregation:\n await self._dynamic_aggregator(\n self.aggregator.get_nodes_pending_models_to_aggregate(),\n malicious_nodes,\n )\n\n @event_handler(\n nebula_pb2.FederationMessage,\n nebula_pb2.FederationMessage.Action.FEDERATION_MODELS_INCLUDED,\n )\n async def _federation_models_included_callback(self, source, message):\n logging.info(f\"\ud83d\udcdd handle_federation_message | Trigger | Received aggregation finished message from {source}\")\n try:\n await self.cm.get_connections_lock().acquire_async()\n if self.round is not None and source in self.cm.connections:\n try:\n if message is not None and len(message.arguments) > 0:\n self.cm.connections[source].update_round(int(message.arguments[0])) if message.round in [\n self.round - 1,\n self.round,\n ] else None\n except Exception as e:\n logging.exception(f\"Error updating round in connection: {e}\")\n else:\n logging.error(f\"Connection not found for {source}\")\n except Exception as e:\n 
logging.exception(f\"Error updating round in connection: {e}\")\n finally:\n await self.cm.get_connections_lock().release_async()\n\n async def create_trainer_module(self):\n asyncio.create_task(self._start_learning())\n logging.info(\"Started trainer module...\")\n\n async def start_communications(self):\n logging.info(f\"Neighbors: {self.config.participant['network_args']['neighbors']}\")\n logging.info(\n f\"\ud83d\udca4 Cold start time: {self.config.participant['misc_args']['grace_time_connection']} seconds before connecting to the network\"\n )\n await asyncio.sleep(self.config.participant[\"misc_args\"][\"grace_time_connection\"])\n await self.cm.start()\n initial_neighbors = self.config.participant[\"network_args\"][\"neighbors\"].split()\n for i in initial_neighbors:\n addr = f\"{i.split(':')[0]}:{i.split(':')[1]}\"\n await self.cm.connect(addr, direct=True)\n await asyncio.sleep(1)\n while not self.cm.verify_connections(initial_neighbors):\n await asyncio.sleep(1)\n current_connections = await self.cm.get_addrs_current_connections()\n logging.info(f\"Connections verified: {current_connections}\")\n await self._reporter.start()\n await self.cm.deploy_additional_services()\n await asyncio.sleep(self.config.participant[\"misc_args\"][\"grace_time_connection\"] // 2)\n\n async def deploy_federation(self):\n await self.federation_ready_lock.acquire_async()\n if self.config.participant[\"device_args\"][\"start\"]:\n logging.info(\n f\"\ud83d\udca4 Waiting for {self.config.participant['misc_args']['grace_time_start_federation']} seconds to start the federation\"\n )\n await asyncio.sleep(self.config.participant[\"misc_args\"][\"grace_time_start_federation\"])\n if self.round is None:\n while not await self.cm.check_federation_ready():\n await asyncio.sleep(1)\n logging.info(\"Sending FEDERATION_START to neighbors...\")\n message = self.cm.mm.generate_federation_message(nebula_pb2.FederationMessage.Action.FEDERATION_START)\n await 
self.cm.send_message_to_neighbors(message)\n await self.get_federation_ready_lock().release_async()\n await self.create_trainer_module()\n else:\n logging.info(\"Federation already started\")\n\n else:\n logging.info(\"Sending FEDERATION_READY to neighbors...\")\n message = self.cm.mm.generate_federation_message(nebula_pb2.FederationMessage.Action.FEDERATION_READY)\n await self.cm.send_message_to_neighbors(message)\n logging.info(\"\ud83d\udca4 Waiting until receiving the start signal from the start node\")\n\n async def _start_learning(self):\n await self.learning_cycle_lock.acquire_async()\n try:\n if self.round is None:\n self.total_rounds = self.config.participant[\"scenario_args\"][\"rounds\"]\n epochs = self.config.participant[\"training_args\"][\"epochs\"]\n await self.get_round_lock().acquire_async()\n self.round = 0\n await self.get_round_lock().release_async()\n await self.learning_cycle_lock.release_async()\n print_msg_box(\n msg=\"Starting Federated Learning process...\",\n indent=2,\n title=\"Start of the experiment\",\n )\n direct_connections = await self.cm.get_addrs_current_connections(only_direct=True)\n undirected_connections = await self.cm.get_addrs_current_connections(only_undirected=True)\n logging.info(\n f\"Initial DIRECT connections: {direct_connections} | Initial UNDIRECT participants: {undirected_connections}\"\n )\n logging.info(\"\ud83d\udca4 Waiting initialization of the federation...\")\n # Lock to wait for the federation to be ready (only affects the first round, when the learning starts)\n # Only applies to non-start nodes --> start node does not wait for the federation to be ready\n await self.get_federation_ready_lock().acquire_async()\n if self.config.participant[\"device_args\"][\"start\"]:\n logging.info(\"Propagate initial model updates.\")\n await self.cm.propagator.propagate(\"initialization\")\n await self.get_federation_ready_lock().release_async()\n\n self.trainer.set_epochs(epochs)\n self.trainer.create_trainer()\n\n 
await self._learning_cycle()\n else:\n if await self.learning_cycle_lock.locked_async():\n await self.learning_cycle_lock.release_async()\n finally:\n if await self.learning_cycle_lock.locked_async():\n await self.learning_cycle_lock.release_async()\n\n async def _disrupt_connection_using_reputation(self, malicious_nodes):\n malicious_nodes = list(set(malicious_nodes) & set(self.get_current_connections()))\n logging.info(f\"Disrupting connection with malicious nodes at round {self.round}\")\n logging.info(f\"Removing {malicious_nodes} from {self.get_current_connections()}\")\n logging.info(f\"Current connections before aggregation at round {self.round}: {self.get_current_connections()}\")\n for malicious_node in malicious_nodes:\n if (self.get_name() != malicious_node) and (malicious_node not in self._secure_neighbors):\n await self.cm.disconnect(malicious_node)\n logging.info(f\"Current connections after aggregation at round {self.round}: {self.get_current_connections()}\")\n\n await self._connect_with_benign(malicious_nodes)\n\n async def _connect_with_benign(self, malicious_nodes):\n lower_threshold = 1\n higher_threshold = len(self.federation_nodes) - 1\n if higher_threshold < lower_threshold:\n higher_threshold = lower_threshold\n\n benign_nodes = [i for i in self.federation_nodes if i not in malicious_nodes]\n logging.info(f\"_reputation_callback benign_nodes at round {self.round}: {benign_nodes}\")\n if len(self.get_current_connections()) <= lower_threshold:\n for node in benign_nodes:\n if len(self.get_current_connections()) <= higher_threshold and self.get_name() != node:\n connected = await self.cm.connect(node)\n if connected:\n logging.info(f\"Connect new connection with at round {self.round}: {connected}\")\n\n async def _dynamic_aggregator(self, aggregated_models_weights, malicious_nodes):\n logging.info(f\"malicious detected at round {self.round}, change aggergation protocol!\")\n if self.aggregator != self.target_aggregation:\n 
logging.info(f\"Current aggregator is: {self.aggregator}\")\n self.aggregator = self.target_aggregation\n await self.aggregator.update_federation_nodes(self.federation_nodes)\n\n for subnodes in aggregated_models_weights.keys():\n sublist = subnodes.split()\n (submodel, weights) = aggregated_models_weights[subnodes]\n for node in sublist:\n if node not in malicious_nodes:\n await self.aggregator.include_model_in_buffer(\n submodel, weights, source=self.get_name(), round=self.round\n )\n logging.info(f\"Current aggregator is: {self.aggregator}\")\n\n async def _waiting_model_updates(self):\n logging.info(f\"\ud83d\udca4 Waiting convergence in round {self.round}.\")\n params = await self.aggregator.get_aggregation()\n if params is not None:\n logging.info(\n f\"_waiting_model_updates | Aggregation done for round {self.round}, including parameters in local model.\"\n )\n self.trainer.set_model_parameters(params)\n else:\n logging.error(\"Aggregation finished with no parameters\")\n\n async def _learning_cycle(self):\n while self.round is not None and self.round < self.total_rounds:\n print_msg_box(\n msg=f\"Round {self.round} of {self.total_rounds} started.\",\n indent=2,\n title=\"Round information\",\n )\n self.trainer.on_round_start()\n self.federation_nodes = await self.cm.get_addrs_current_connections(only_direct=True, myself=True)\n logging.info(f\"Federation nodes: {self.federation_nodes}\")\n direct_connections = await self.cm.get_addrs_current_connections(only_direct=True)\n undirected_connections = await self.cm.get_addrs_current_connections(only_undirected=True)\n logging.info(f\"Direct connections: {direct_connections} | Undirected connections: {undirected_connections}\")\n logging.info(f\"[Role {self.role}] Starting learning cycle...\")\n await self.aggregator.update_federation_nodes(self.federation_nodes)\n await self._extended_learning_cycle()\n\n await self.get_round_lock().acquire_async()\n print_msg_box(\n msg=f\"Round {self.round} of 
{self.total_rounds} finished.\",\n indent=2,\n title=\"Round information\",\n )\n await self.aggregator.reset()\n self.trainer.on_round_end()\n self.round = self.round + 1\n self.config.participant[\"federation_args\"][\"round\"] = (\n self.round\n ) # Set current round in config (send to the controller)\n await self.get_round_lock().release_async()\n\n # End of the learning cycle\n self.trainer.on_learning_cycle_end()\n await self.trainer.test()\n self.round = None\n self.total_rounds = None\n print_msg_box(\n msg=\"Federated Learning process has been completed.\",\n indent=2,\n title=\"End of the experiment\",\n )\n # Report\n if self.config.participant[\"scenario_args\"][\"controller\"] != \"nebula-test\":\n result = await self.reporter.report_scenario_finished()\n if result:\n pass\n else:\n logging.error(\"Error reporting scenario finished\")\n\n logging.info(\"Checking if all my connections reached the total rounds...\")\n while not self.cm.check_finished_experiment():\n await asyncio.sleep(1)\n\n # Disable logging once the experiment has finished\n logging.getLogger().disabled = True\n\n # Kill itself\n if self.config.participant[\"scenario_args\"][\"deployment\"] == \"docker\":\n try:\n self.client.containers.get(self.docker_id).stop()\n except Exception as e:\n print(f\"Error stopping Docker container with ID {self.docker_id}: {e}\")\n\n async def _extended_learning_cycle(self):\n \"\"\"\n This method is called in each round of the learning cycle. It is used to extend the learning cycle with additional\n functionalities. 
The method is called in the _learning_cycle method.\n \"\"\"\n pass\n\n def reputation_calculation(self, aggregated_models_weights):\n cossim_threshold = 0.5\n loss_threshold = 0.5\n\n current_models = {}\n for subnodes in aggregated_models_weights.keys():\n sublist = subnodes.split()\n submodel = aggregated_models_weights[subnodes][0]\n for node in sublist:\n current_models[node] = submodel\n\n malicious_nodes = []\n reputation_score = {}\n local_model = self.trainer.get_model_parameters()\n untrusted_nodes = list(current_models.keys())\n logging.info(f\"reputation_calculation untrusted_nodes at round {self.round}: {untrusted_nodes}\")\n\n for untrusted_node in untrusted_nodes:\n logging.info(f\"reputation_calculation untrusted_node at round {self.round}: {untrusted_node}\")\n logging.info(f\"reputation_calculation self.get_name() at round {self.round}: {self.get_name()}\")\n if untrusted_node != self.get_name():\n untrusted_model = current_models[untrusted_node]\n cossim = cosine_metric(local_model, untrusted_model, similarity=True)\n logging.info(f\"reputation_calculation cossim at round {self.round}: {untrusted_node}: {cossim}\")\n self.trainer._logger.log_data({f\"Reputation/cossim_{untrusted_node}\": cossim}, step=self.round)\n\n avg_loss = self.trainer.validate_neighbour_model(untrusted_model)\n logging.info(f\"reputation_calculation avg_loss at round {self.round} {untrusted_node}: {avg_loss}\")\n self.trainer._logger.log_data({f\"Reputation/avg_loss_{untrusted_node}\": avg_loss}, step=self.round)\n reputation_score[untrusted_node] = (cossim, avg_loss)\n\n if cossim < cossim_threshold or avg_loss > loss_threshold:\n malicious_nodes.append(untrusted_node)\n else:\n self._secure_neighbors.append(untrusted_node)\n\n return malicious_nodes, reputation_score\n\n async def send_reputation(self, malicious_nodes):\n logging.info(f\"Sending REPUTATION to the rest of the topology: {malicious_nodes}\")\n message = self.cm.mm.generate_federation_message(\n 
nebula_pb2.FederationMessage.Action.REPUTATION, malicious_nodes\n )\n await self.cm.send_message_to_neighbors(message)\n
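The threshold rule applied by `reputation_calculation` can be illustrated in isolation. The following is a minimal sketch, not NEBULA's implementation: `cosine_similarity` and `classify_neighbors` are hypothetical helpers, models are flattened to plain lists of floats, and the validation loss is passed in rather than computed by a trainer. Only the two 0.5 thresholds and the `cossim < threshold or loss > threshold` rule are taken from the source.

```python
import math

COSSIM_THRESHOLD = 0.5  # same value as in reputation_calculation
LOSS_THRESHOLD = 0.5    # same value as in reputation_calculation


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_neighbors(local_model, neighbor_models, neighbor_losses):
    """Return (malicious, scores) by applying both thresholds to each neighbor."""
    malicious, scores = [], {}
    for name, model in neighbor_models.items():
        cossim = cosine_similarity(local_model, model)
        loss = neighbor_losses[name]
        scores[name] = (cossim, loss)
        # A neighbor is flagged if EITHER metric is on the wrong side.
        if cossim < COSSIM_THRESHOLD or loss > LOSS_THRESHOLD:
            malicious.append(name)
    return malicious, scores


# Made-up data: node-b points the opposite way, node-c has a high loss.
local = [1.0, 0.0]
models = {"node-a": [0.9, 0.1], "node-b": [-1.0, 0.0], "node-c": [1.0, 0.0]}
losses = {"node-a": 0.2, "node-b": 0.1, "node-c": 0.9}
malicious, scores = classify_neighbors(local, models, losses)
print(malicious)
```

Because the two conditions are combined with `or`, a model that is almost identical to the local one is still flagged when its validation loss exceeds the threshold.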
EventManager
nebula/core/eventmanager.py
class EventManager:\n def __init__(self, default_callbacks=None):\n self._event_callbacks = defaultdict(list)\n self._register_default_callbacks(default_callbacks or [])\n\n def _register_default_callbacks(self, default_callbacks):\n \"\"\"Registers default callbacks for events.\"\"\"\n for callback in default_callbacks:\n handler_info = getattr(callback, \"_event_handler\", None)\n if handler_info is not None:\n self.register_event(handler_info, callback)\n else:\n raise ValueError(\"The callback must be decorated with @event_handler.\")\n\n def register_event(self, handler_info, callback):\n \"\"\"Records a callback for a specific event.\"\"\"\n if callable(callback):\n self._event_callbacks[handler_info].append(callback)\n else:\n raise ValueError(\"The callback must be a callable function.\")\n\n def unregister_event(self, handler_info, callback):\n \"\"\"Unregisters a previously registered callback for an event.\"\"\"\n if callback in self._event_callbacks[handler_info]:\n self._event_callbacks[handler_info].remove(callback)\n\n async def trigger_event(self, source, message, *args, **kwargs):\n \"\"\"Triggers an event, executing all associated callbacks.\"\"\"\n message_type = message.DESCRIPTOR.full_name\n if hasattr(message, \"action\"):\n action_name = message.Action.Name(message.action)\n else:\n action_name = \"None\"\n\n handler_info = (message_type, action_name)\n\n if handler_info in self._event_callbacks:\n for callback in self._event_callbacks[handler_info]:\n try:\n if asyncio.iscoroutinefunction(callback) or inspect.iscoroutine(callback):\n await callback(source, message, *args, **kwargs)\n else:\n callback(source, message, *args, **kwargs)\n except Exception as e:\n logging.exception(f\"Error executing callback for {handler_info}: {e}\")\n else:\n logging.error(f\"No callbacks registered for event {handler_info}\")\n\n async def get_event_callbacks(self, event_name):\n \"\"\"Returns the callbacks for a specific event.\"\"\"\n return 
self._event_callbacks[event_name]\n\n def get_event_callbacks_names(self):\n \"\"\"Returns the names of the registered events.\"\"\"\n return self._event_callbacks.keys()\n
get_event_callbacks(event_name)
async
Returns the callbacks for a specific event.
async def get_event_callbacks(self, event_name):\n \"\"\"Returns the callbacks for a specific event.\"\"\"\n return self._event_callbacks[event_name]\n
get_event_callbacks_names()
Returns the names of the registered events.
def get_event_callbacks_names(self):\n \"\"\"Returns the names of the registered events.\"\"\"\n return self._event_callbacks.keys()\n
register_event(handler_info, callback)
Records a callback for a specific event.
def register_event(self, handler_info, callback):\n \"\"\"Records a callback for a specific event.\"\"\"\n if callable(callback):\n self._event_callbacks[handler_info].append(callback)\n else:\n raise ValueError(\"The callback must be a callable function.\")\n
trigger_event(source, message, *args, **kwargs)
Triggers an event, executing all associated callbacks.
async def trigger_event(self, source, message, *args, **kwargs):\n \"\"\"Triggers an event, executing all associated callbacks.\"\"\"\n message_type = message.DESCRIPTOR.full_name\n if hasattr(message, \"action\"):\n action_name = message.Action.Name(message.action)\n else:\n action_name = \"None\"\n\n handler_info = (message_type, action_name)\n\n if handler_info in self._event_callbacks:\n for callback in self._event_callbacks[handler_info]:\n try:\n if asyncio.iscoroutinefunction(callback) or inspect.iscoroutine(callback):\n await callback(source, message, *args, **kwargs)\n else:\n callback(source, message, *args, **kwargs)\n except Exception as e:\n logging.exception(f\"Error executing callback for {handler_info}: {e}\")\n else:\n logging.error(f\"No callbacks registered for event {handler_info}\")\n
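The dispatch step of `trigger_event` can be reduced to a few lines. This is a simplified sketch under assumptions: `dispatch` and the three callbacks are hypothetical names, and the message is a plain string rather than a protobuf object. It shows coroutine functions being awaited, plain functions being called directly, and one failing callback not stopping the rest.

```python
import asyncio
import logging


async def dispatch(callbacks, source, message):
    """Run every callback, awaiting coroutine functions and calling the rest."""
    for callback in callbacks:
        try:
            if asyncio.iscoroutinefunction(callback):
                await callback(source, message)
            else:
                callback(source, message)
        except Exception:
            # Log and continue so the remaining callbacks still run.
            logging.exception("Error executing callback %r", callback)


received = []


def sync_callback(source, message):
    received.append(("sync", source, message))


def broken_callback(source, message):
    raise RuntimeError("boom")


async def async_callback(source, message):
    await asyncio.sleep(0)
    received.append(("async", source, message))


asyncio.run(dispatch([sync_callback, broken_callback, async_callback], "node-1", "hello"))
print(received)
```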
unregister_event(handler_info, callback)
Unregisters a previously registered callback for an event.
def unregister_event(self, handler_info, callback):\n \"\"\"Unregisters a previously registered callback for an event.\"\"\"\n if callback in self._event_callbacks[handler_info]:\n self._event_callbacks[handler_info].remove(callback)\n
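One detail is worth knowing when calling `unregister_event`: because `_event_callbacks` is a `defaultdict(list)`, merely looking up a key that was never registered creates an empty entry for it. The sketch below demonstrates this standard-library behaviour with a hypothetical registry, callback, and event keys; it does not import NEBULA.

```python
from collections import defaultdict

registry = defaultdict(list)  # stands in for EventManager._event_callbacks


def on_start(source, message):
    """Hypothetical callback."""


key = ("demo.FederationMessage", "FEDERATION_START")      # illustrative event key
unknown = ("demo.FederationMessage", "NEVER_REGISTERED")  # illustrative event key

registry[key].append(on_start)

# Same membership test that unregister_event performs.
if on_start in registry[unknown]:
    registry[unknown].remove(on_start)

print(unknown in registry)  # the lookup itself created an entry
print(registry[unknown])    # and that entry is an empty list

# Looking the key up through .get() leaves the mapping untouched.
clean = defaultdict(list)
if on_start in clean.get(unknown, []):
    clean[unknown].remove(on_start)
print(unknown in clean)
```

A consequence for the class as written is that, after such a lookup, `trigger_event` finds the key present with no callbacks and does nothing, instead of logging "No callbacks registered".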
event_handler(message_type, action)
Decorator for registering an event handler.
def event_handler(message_type, action):\n \"\"\"Decorator for registering an event handler.\"\"\"\n\n def decorator(func):\n @wraps(func)\n async def async_wrapper(*args, **kwargs):\n return await func(*args, **kwargs)\n\n @wraps(func)\n def sync_wrapper(*args, **kwargs):\n return func(*args, **kwargs)\n\n if asyncio.iscoroutinefunction(func):\n wrapper = async_wrapper\n else:\n wrapper = sync_wrapper\n\n action_name = message_type.Action.Name(action) if action is not None else \"None\"\n wrapper._event_handler = (message_type.DESCRIPTOR.full_name, action_name)\n return wrapper\n\n return decorator\n
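To see what the decorator attaches to a function, it helps to run it against a stand-in message type. In the sketch below, `FakeFederationMessage` and `FakeAction` are hypothetical substitutes for a generated protobuf class (they only provide `DESCRIPTOR.full_name` and `Action.Name`), and `event_handler` is a condensed copy of the decorator above rather than an import from NEBULA.

```python
import asyncio
from functools import wraps
from types import SimpleNamespace


class FakeAction:
    """Hypothetical stand-in for a protobuf enum wrapper."""
    FEDERATION_START = 0
    _names = {0: "FEDERATION_START"}

    @classmethod
    def Name(cls, value):
        return cls._names[value]


class FakeFederationMessage:
    """Hypothetical stand-in for a generated protobuf message class."""
    DESCRIPTOR = SimpleNamespace(full_name="demo.FederationMessage")
    Action = FakeAction


def event_handler(message_type, action):
    """Condensed copy of the decorator: tags the function with its event key."""
    def decorator(func):
        if asyncio.iscoroutinefunction(func):
            @wraps(func)
            async def wrapper(*args, **kwargs):
                return await func(*args, **kwargs)
        else:
            @wraps(func)
            def wrapper(*args, **kwargs):
                return func(*args, **kwargs)
        action_name = message_type.Action.Name(action) if action is not None else "None"
        wrapper._event_handler = (message_type.DESCRIPTOR.full_name, action_name)
        return wrapper
    return decorator


@event_handler(FakeFederationMessage, FakeFederationMessage.Action.FEDERATION_START)
async def on_federation_start(source, message):
    return f"start requested by {source}"


@event_handler(FakeFederationMessage, None)
def on_any(source, message):
    return "no action"


print(on_federation_start._event_handler)
print(on_any._event_handler)
print(asyncio.run(on_federation_start("node-1", None)))
```

The `_event_handler` tuple is the same `(message_type, action_name)` key that `trigger_event` builds from an incoming message, which is how a decorated callback and a received message are matched.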
Role
This class defines the participant roles of the platform.
nebula/core/role.py
class Role:\n \"\"\"\n This class defines the participant roles of the platform.\n \"\"\"\n\n TRAINER = \"trainer\"\n AGGREGATOR = \"aggregator\"\n PROXY = \"proxy\"\n IDLE = \"idle\"\n SERVER = \"server\"\n
BlockchainHandler
Handles interaction with the Oracle and the non-validator node of the blockchain network.
nebula/core/aggregation/blockchainReputation.py
class BlockchainHandler:\n \"\"\"\n Handles interaction with Oracle and Non-Validator Node of Blockchain Network\n \"\"\"\n\n # static ip address of non-validator node with RPC-API\n __rpc_url = \"http://172.25.0.104:8545\"\n\n # static ip address of oracle with REST-API\n __oracle_url = \"http://172.25.0.105:8081\"\n\n # default REST header for interacting with oracle\n __rest_header = {\"Content-type\": \"application/json\", \"Accept\": \"application/json\"}\n\n def __init__(self, home_address):\n print_with_frame(\"BLOCKCHAIN INITIALIZATION: START\")\n\n # local NEBULA name, needed for reputation system\n self.__home_ip = home_address\n\n # randomly generated private key, needed to sign transaction\n self.__private_key = \"\"\n\n # public wallet address generated from the private key\n self.__acc_address = \"\"\n\n # variables for experiment, not required for aggregation\n self.__gas_used = 0\n self.__gas_price = 27.3\n self.round = 1\n\n # generate randomized primary key\n self.__acc = self.__create_account()\n\n # configure web3 objects for using Proof-of-Authority\n self.__web3 = self.__initialize_web3()\n\n # call Oracle to sense if blockchain is ready\n print(f\"{'-' * 25} CONNECT TO ORACLE {'-' * 25}\", flush=True)\n self.__wait_for_blockchain()\n\n # request ETH funds for creating transactions, paying gas\n self.__request_funds_from_oracle()\n\n # check if funds were assigned by checking directly with blockchain\n self.verify_balance()\n\n # request contract address and header from Oracle\n self.__contract_obj = self.__get_contract_from_oracle()\n\n # register public wallet address at reputation system\n print(f\"{'-' * 25} CONNECT TO REPUTATION SYSTEM {'-' * 25}\", flush=True)\n self.__register()\n print(\"BLOCKCHAIN: Registered to reputation system\", flush=True)\n\n # check if registration was successful\n self.verify_registration()\n print(\"BLOCKCHAIN: Verified registration\", flush=True)\n\n print_with_frame(\"BLOCKCHAIN INITIALIZATION: 
FINISHED\")\n\n @classmethod\n @property\n def oracle_url(cls) -> str:\n return cls.__oracle_url\n\n @classmethod\n @property\n def rest_header(cls) -> Mapping[str, str]:\n return cls.__rest_header\n\n def __create_account(self):\n \"\"\"\n Generates randomized primary key and derives public account from it\n Returns: None\n\n \"\"\"\n print(f\"{'-' * 25} REGISTER WORKING NODE {'-' * 25}\", flush=True)\n\n # generate random private key, address, public address\n acc = Account.create()\n\n # initialize web3 utility object\n web3 = Web3()\n\n # convert private key to hex, used in raw transactions\n self.__private_key = web3.to_hex(acc.key)\n\n # convert address type, used in raw transactions\n self.__acc_address = web3.to_checksum_address(acc.address)\n\n print(f\"WORKER NODE: Registered account: {self.__home_ip}\", flush=True)\n print(f\"WORKER NODE: Account address: {self.__acc_address}\", flush=True)\n\n # return generated account\n return acc\n\n def __initialize_web3(self):\n \"\"\"\n Initializes Web3 object and configures it for PoA protocol\n Returns: Web3 object\n\n \"\"\"\n\n # initialize Web3 object with ip of non-validator node\n web3 = Web3(Web3.HTTPProvider(self.__rpc_url, request_kwargs={\"timeout\": 20})) # 10\n\n # inject Proof-of-Authority settings to object\n web3.middleware_onion.inject(geth_poa_middleware, layer=0)\n\n # automatically sign transactions if available for execution\n web3.middleware_onion.add(construct_sign_and_send_raw_middleware(self.__acc))\n\n # inject local account as default\n web3.eth.default_account = self.__acc_address\n\n # return initialized object for executing transaction\n return web3\n\n @retry((Exception, requests.exceptions.HTTPError), tries=20, delay=4)\n def __wait_for_blockchain(self) -> None:\n \"\"\"\n Request state of blockchain from Oracle by periodic calls and sleep\n Returns: None\n\n \"\"\"\n\n # check with oracle if blockchain is ready for requests\n response = requests.get(\n 
url=f\"{self.__oracle_url}/status\",\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return print(\"ORACLE: Blockchain is ready\", flush=True)\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def __request_funds_from_oracle(self) -> None:\n \"\"\"\n Requests funds from Oracle by sending public address\n Returns: None\n\n \"\"\"\n\n # call oracle's faucet by Http post request\n response = requests.post(\n url=f\"{self.__oracle_url}/faucet\",\n json={\"address\": self.__acc_address},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return print(\"ORACLE: Received 500 ETH\", flush=True)\n\n def verify_balance(self) -> None:\n \"\"\"\n Calls blockchain directly for requesting current balance\n Returns: None\n\n \"\"\"\n\n # directly call view method from non-validator node\n balance = self.__web3.eth.get_balance(self.__acc_address, \"latest\")\n\n # convert wei to ether\n balance_eth = self.__web3.from_wei(balance, \"ether\")\n print(\n f\"BLOCKCHAIN: Successfully verified balance of {balance_eth} ETH\",\n flush=True,\n )\n\n # if node ran out of funds, it requests ether from the oracle\n if balance_eth <= 1:\n self.__request_funds_from_oracle()\n\n return None\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def __get_contract_from_oracle(self):\n \"\"\"\n Requests header file and contract address, generates Web3 Contract object with it\n Returns: Web3 Contract object\n\n \"\"\"\n\n response = requests.get(\n url=f\"{self.__oracle_url}/contract\",\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # convert response to json to extract the abi and address\n json_response = response.json()\n\n print(\n f\"ORACLE: Initialized chain code: 
{json_response.get('address')}\",\n flush=True,\n )\n\n # return an initialized web3 contract object\n return self.__web3.eth.contract(abi=json_response.get(\"abi\"), address=json_response.get(\"address\"))\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def report_gas_oracle(self) -> list:\n \"\"\"\n Reports accumulated gas costs of all transactions made to the blockchain\n Returns: List of all accumulated gas costs per registered node\n\n \"\"\"\n\n # method used for experiments, not needed for aggregation\n response = requests.post(\n url=f\"{self.__oracle_url}/gas\",\n json={\"amount\": self.__gas_used, \"round\": self.round},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # reset local gas accumulation\n self.__gas_used = 0\n\n # return list with gas usage for logging\n return list(response.json().items())\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def report_reputation_oracle(self, records: list) -> None:\n \"\"\"\n Reports reputations used for aggregation\n Returns: None\n\n \"\"\"\n\n # method used for experiments, not needed for aggregation\n response = requests.post(\n url=f\"{self.__oracle_url}/reputation\",\n json={\"records\": records, \"round\": self.round, \"sender\": self.__home_ip},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return None\n\n def __sign_and_deploy(self, trx_hash):\n \"\"\"\n Signs a function call to the chain code with the primary key and awaits the receipt\n Args:\n trx_hash: Transformed dictionary of all properties relevant for call to chain code\n\n Returns: transaction receipt confirming the successful write to the ledger\n\n \"\"\"\n\n # transaction is signed with private key\n signed_transaction = self.__web3.eth.account.sign_transaction(trx_hash, private_key=self.__private_key)\n\n # 
confirmation that transaction was passed from non-validator node to validator nodes\n executed_transaction = self.__web3.eth.send_raw_transaction(signed_transaction.rawTransaction)\n\n # non-validator node awaited the successful validation by validation nodes and returns receipt\n transaction_receipt = self.__web3.eth.wait_for_transaction_receipt(executed_transaction)\n\n # accumulate used gas\n self.__gas_used += transaction_receipt.gasUsed\n\n return transaction_receipt\n\n @retry(Exception, tries=3, delay=4)\n def push_opinions(self, opinion_dict: dict):\n \"\"\"\n Pushes all locally computed opinions of models to aggregate to the reputation system\n Args:\n opinion_dict: Dict of all names:opinions for writing to the reputation system\n\n Returns: Json of transaction receipt\n\n \"\"\"\n\n # create raw transaction object to call rate_neighbors() from the reputation system\n unsigned_trx = self.__contract_obj.functions.rate_neighbours(list(opinion_dict.items())).build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.__acc_address,\n \"nonce\": self.__web3.eth.get_transaction_count(\n self.__web3.to_checksum_address(self.__acc_address), \"pending\"\n ),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price, \"gwei\"),\n })\n\n # sign and execute the transaction\n conf = self.__sign_and_deploy(unsigned_trx)\n\n self.report_reputation_oracle(list(opinion_dict.items()))\n # return the receipt as json\n return self.__web3.to_json(conf)\n\n @retry(Exception, tries=3, delay=4)\n def get_reputations(self, ip_addresses: list) -> dict:\n \"\"\"\n Requests globally aggregated opinions values from reputation system for computing aggregation weights\n Args:\n ip_addresses: Names of nodes of which the reputation values should be generated\n\n Returns: Dictionary of name:reputation from the reputation system\n\n \"\"\"\n\n final_reputations = dict()\n stats_to_print = list()\n\n # call get_reputations() from reputation system\n raw_reputation = 
self.__contract_obj.functions.get_reputations(ip_addresses).call({\"from\": self.__acc_address})\n\n # loop list with tuples from reputation system\n for (\n name,\n reputation,\n weighted_reputation,\n stddev_count,\n divisor,\n final_reputation,\n avg,\n median,\n stddev,\n index,\n avg_deviation,\n avg_avg_deviation,\n malicious_opinions,\n ) in raw_reputation:\n # list elements with an empty name can be ignored\n if not name:\n continue\n\n # print statistical values\n stats_to_print.append([\n name,\n reputation / 10,\n weighted_reputation / 10,\n stddev_count / 10,\n divisor / 10,\n final_reputation / 10,\n avg / 10,\n median / 10,\n stddev / 10,\n avg_deviation / 10,\n avg_avg_deviation / 10,\n malicious_opinions,\n ])\n\n # assign the final reputation to a dict for later aggregation\n final_reputations[name] = final_reputation / 10\n\n print_table(\n \"REPUTATION SYSTEM STATE\",\n stats_to_print,\n [\n \"Name\",\n \"Reputation\",\n \"Weighted Rep. by local Node\",\n \"Stddev Count\",\n \"Divisor\",\n \"Final Reputation\",\n \"Mean\",\n \"Median\",\n \"Stddev\",\n \"Avg Deviation in Opinion\",\n \"Avg of all Avg Deviations in Opinions\",\n \"Malicious Opinions\",\n ],\n )\n\n # if sum(final_reputations.values()):\n # self.report_reputation_oracle(list(final_reputations.items()))\n\n return final_reputations\n\n @retry(Exception, tries=3, delay=4)\n def __register(self) -> str:\n \"\"\"\n Registers a node's name with its public address, signed with private key\n Returns: Json of transaction receipt\n\n \"\"\"\n\n # build raw transaction object to call public method register() from reputation system\n unsigned_trx = self.__contract_obj.functions.register(self.__home_ip).build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.__acc_address,\n \"nonce\": self.__web3.eth.get_transaction_count(\n self.__web3.to_checksum_address(self.__acc_address), \"pending\"\n ),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price, \"gwei\"),\n })\n\n # 
sign and execute created transaction\n conf = self.__sign_and_deploy(unsigned_trx)\n\n # return the receipt as json\n return self.__web3.to_json(conf)\n\n @retry(Exception, tries=3, delay=4)\n def verify_registration(self) -> None:\n \"\"\"\n Verifies the successful registration of the node itself,\n executes registration again if reputation system returns false\n Returns: None\n\n \"\"\"\n\n # call view function of reputation system to check if registration was not abandoned by hard fork\n confirmation = self.__contract_obj.functions.confirm_registration().call({\"from\": self.__acc_address})\n\n # function returns boolean\n if not confirmation:\n # register again if not successful\n self.__register()\n\n # raise Exception to check again\n raise Exception(\"EXCEPTION: _verify_registration() => Could not be confirmed)\")\n\n return None\n\n @retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\n def report_time_oracle(self, start: float) -> None:\n \"\"\"\n Reports time used for aggregation\n Returns: None\n\n \"\"\"\n # method used for experiments, not needed for aggregation\n # report aggregation time and round to oracle\n response = requests.post(\n url=f\"{BlockchainHandler.oracle_url}/time\",\n json={\"time\": (time.time_ns() - start) / (10**9), \"round\": self.round},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # increase aggregation round counter after reporting time\n self.round += 1\n return None\n
__create_account()
Generates a random private key and derives the public account from it. Returns: the generated account object
def __create_account(self):\n \"\"\"\n Generates a random private key and derives the public account from it\n Returns: the generated account object\n\n \"\"\"\n print(f\"{'-' * 25} REGISTER WORKING NODE {'-' * 25}\", flush=True)\n\n # generate random private key, address, public address\n acc = Account.create()\n\n # initialize web3 utility object\n web3 = Web3()\n\n # convert private key to hex, used in raw transactions\n self.__private_key = web3.to_hex(acc.key)\n\n # convert address type, used in raw transactions\n self.__acc_address = web3.to_checksum_address(acc.address)\n\n print(f\"WORKER NODE: Registered account: {self.__home_ip}\", flush=True)\n print(f\"WORKER NODE: Account address: {self.__acc_address}\", flush=True)\n\n # return generated account\n return acc\n
__get_contract_from_oracle()
Requests the contract ABI and address from the Oracle and builds a Web3 Contract object from them. Returns: Web3 Contract object
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef __get_contract_from_oracle(self):\n \"\"\"\n Requests header file and contract address, generates Web3 Contract object with it\n Returns: Web3 Contract object\n\n \"\"\"\n\n response = requests.get(\n url=f\"{self.__oracle_url}/contract\",\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # convert response to json to extract the abi and address\n json_response = response.json()\n\n print(\n f\"ORACLE: Initialized chain code: {json_response.get('address')}\",\n flush=True,\n )\n\n # return an initialized web3 contract object\n return self.__web3.eth.contract(abi=json_response.get(\"abi\"), address=json_response.get(\"address\"))\n
__initialize_web3()
Initializes Web3 object and configures it for PoA protocol Returns: Web3 object
def __initialize_web3(self):\n \"\"\"\n Initializes Web3 object and configures it for PoA protocol\n Returns: Web3 object\n\n \"\"\"\n\n # initialize Web3 object with ip of non-validator node\n web3 = Web3(Web3.HTTPProvider(self.__rpc_url, request_kwargs={\"timeout\": 20})) # 10\n\n # inject Proof-of-Authority settings to object\n web3.middleware_onion.inject(geth_poa_middleware, layer=0)\n\n # automatically sign transactions if available for execution\n web3.middleware_onion.add(construct_sign_and_send_raw_middleware(self.__acc))\n\n # inject local account as default\n web3.eth.default_account = self.__acc_address\n\n # return initialized object for executing transaction\n return web3\n
__register()
Registers a node's name with its public address, signed with the private key. Returns: JSON of the transaction receipt
@retry(Exception, tries=3, delay=4)\ndef __register(self) -> str:\n \"\"\"\n Registers a node's name with its public address, signed with private key\n Returns: Json of transaction receipt\n\n \"\"\"\n\n # build raw transaction object to call public method register() from reputation system\n unsigned_trx = self.__contract_obj.functions.register(self.__home_ip).build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.__acc_address,\n \"nonce\": self.__web3.eth.get_transaction_count(\n self.__web3.to_checksum_address(self.__acc_address), \"pending\"\n ),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price, \"gwei\"),\n })\n\n # sign and execute created transaction\n conf = self.__sign_and_deploy(unsigned_trx)\n\n # return the receipt as json\n return self.__web3.to_json(conf)\n
__request_funds_from_oracle()
Requests funds from Oracle by sending public address Returns: None
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef __request_funds_from_oracle(self) -> None:\n \"\"\"\n Requests funds from Oracle by sending public address\n Returns: None\n\n \"\"\"\n\n # call oracle's faucet by Http post request\n response = requests.post(\n url=f\"{self.__oracle_url}/faucet\",\n json={\"address\": self.__acc_address},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return print(\"ORACLE: Received 500 ETH\", flush=True)\n
__sign_and_deploy(trx_hash)
Signs a function call to the chain code with the private key and awaits the receipt Args: trx_hash: Transformed dictionary of all properties relevant for call to chain code
Returns: transaction receipt confirming the successful write to the ledger
def __sign_and_deploy(self, trx_hash):\n \"\"\"\n Signs a function call to the chain code with the primary key and awaits the receipt\n Args:\n trx_hash: Transformed dictionary of all properties relevant for call to chain code\n\n Returns: transaction receipt confirming the successful write to the ledger\n\n \"\"\"\n\n # transaction is signed with private key\n signed_transaction = self.__web3.eth.account.sign_transaction(trx_hash, private_key=self.__private_key)\n\n # confirmation that transaction was passed from non-validator node to validator nodes\n executed_transaction = self.__web3.eth.send_raw_transaction(signed_transaction.rawTransaction)\n\n # non-validator node awaited the successful validation by validation nodes and returns receipt\n transaction_receipt = self.__web3.eth.wait_for_transaction_receipt(executed_transaction)\n\n # accumulate used gas\n self.__gas_used += transaction_receipt.gasUsed\n\n return transaction_receipt\n
__wait_for_blockchain()
Asks the Oracle whether the blockchain is ready; the `@retry` decorator repeats the call (up to 20 tries, with a delay of 4 between them) until it succeeds. Returns: None
@retry((Exception, requests.exceptions.HTTPError), tries=20, delay=4)\ndef __wait_for_blockchain(self) -> None:\n \"\"\"\n Request state of blockchain from Oracle by periodic calls and sleep\n Returns: None\n\n \"\"\"\n\n # check with oracle if blockchain is ready for requests\n response = requests.get(\n url=f\"{self.__oracle_url}/status\",\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return print(\"ORACLE: Blockchain is ready\", flush=True)\n
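`__wait_for_blockchain` relies on the `@retry` decorator to poll the Oracle: any exception triggers another attempt after a fixed delay, up to a maximum number of tries. The sketch below re-creates that pattern with the standard library only. `simple_retry` and `check_status` are hypothetical; the real code uses an imported `retry` decorator and an HTTP request, and the delay here is shortened so the example finishes quickly.

```python
import time
from functools import wraps


def simple_retry(exceptions, tries, delay):
    """Hypothetical stand-in for a retry decorator: fixed delay, bounded tries."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # give up after the last attempt
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = []


@simple_retry(Exception, tries=20, delay=0.01)  # the source uses delay=4
def check_status():
    """Pretend the Oracle is unreachable for the first two calls."""
    attempts.append(time.monotonic())
    if len(attempts) < 3:
        raise ConnectionError("oracle not ready")
    return "ready"


result = check_status()
print(result, len(attempts))
```

With this pattern the caller sees a single blocking call; it returns once the check succeeds, or re-raises the last error when all tries are used up.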
get_reputations(ip_addresses)
Requests globally aggregated opinion values from the reputation system for computing aggregation weights. Args: ip_addresses: Names of the nodes whose reputation values should be retrieved
Returns: Dictionary of name:reputation from the reputation system
@retry(Exception, tries=3, delay=4)\ndef get_reputations(self, ip_addresses: list) -> dict:\n \"\"\"\n Requests globally aggregated opinions values from reputation system for computing aggregation weights\n Args:\n ip_addresses: Names of nodes of which the reputation values should be generated\n\n Returns: Dictionary of name:reputation from the reputation system\n\n \"\"\"\n\n final_reputations = dict()\n stats_to_print = list()\n\n # call get_reputations() from reputation system\n raw_reputation = self.__contract_obj.functions.get_reputations(ip_addresses).call({\"from\": self.__acc_address})\n\n # loop list with tuples from reputation system\n for (\n name,\n reputation,\n weighted_reputation,\n stddev_count,\n divisor,\n final_reputation,\n avg,\n median,\n stddev,\n index,\n avg_deviation,\n avg_avg_deviation,\n malicious_opinions,\n ) in raw_reputation:\n # list elements with an empty name can be ignored\n if not name:\n continue\n\n # print statistical values\n stats_to_print.append([\n name,\n reputation / 10,\n weighted_reputation / 10,\n stddev_count / 10,\n divisor / 10,\n final_reputation / 10,\n avg / 10,\n median / 10,\n stddev / 10,\n avg_deviation / 10,\n avg_avg_deviation / 10,\n malicious_opinions,\n ])\n\n # assign the final reputation to a dict for later aggregation\n final_reputations[name] = final_reputation / 10\n\n print_table(\n \"REPUTATION SYSTEM STATE\",\n stats_to_print,\n [\n \"Name\",\n \"Reputation\",\n \"Weighted Rep. by local Node\",\n \"Stddev Count\",\n \"Divisor\",\n \"Final Reputation\",\n \"Mean\",\n \"Median\",\n \"Stddev\",\n \"Avg Deviation in Opinion\",\n \"Avg of all Avg Deviations in Opinions\",\n \"Malicious Opinions\",\n ],\n )\n\n # if sum(final_reputations.values()):\n # self.report_reputation_oracle(list(final_reputations.items()))\n\n return final_reputations\n
push_opinions(opinion_dict)
Pushes all locally computed opinions of the models to be aggregated to the reputation system. Args: opinion_dict: Dict of name:opinion pairs to write to the reputation system
Returns: Json of transaction receipt
@retry(Exception, tries=3, delay=4)\ndef push_opinions(self, opinion_dict: dict):\n \"\"\"\n Pushes all locally computed opinions of models to aggregate to the reputation system\n Args:\n opinion_dict: Dict of all names:opinions for writing to the reputation system\n\n Returns: Json of transaction receipt\n\n \"\"\"\n\n # create raw transaction object to call rate_neighbors() from the reputation system\n unsigned_trx = self.__contract_obj.functions.rate_neighbours(list(opinion_dict.items())).build_transaction({\n \"chainId\": self.__web3.eth.chain_id,\n \"from\": self.__acc_address,\n \"nonce\": self.__web3.eth.get_transaction_count(\n self.__web3.to_checksum_address(self.__acc_address), \"pending\"\n ),\n \"gasPrice\": self.__web3.to_wei(self.__gas_price, \"gwei\"),\n })\n\n # sign and execute the transaction\n conf = self.__sign_and_deploy(unsigned_trx)\n\n self.report_reputation_oracle(list(opinion_dict.items()))\n # return the receipt as json\n return self.__web3.to_json(conf)\n
report_gas_oracle()
Reports accumulated gas costs of all transactions made to the blockchain Returns: List of all accumulated gas costs per registered node
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef report_gas_oracle(self) -> list:\n \"\"\"\n Reports accumulated gas costs of all transactions made to the blockchain\n Returns: List of all accumulated gas costs per registered node\n\n \"\"\"\n\n # method used for experiments, not needed for aggregation\n response = requests.post(\n url=f\"{self.__oracle_url}/gas\",\n json={\"amount\": self.__gas_used, \"round\": self.round},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # reset local gas accumulation\n self.__gas_used = 0\n\n # return list with gas usage for logging\n return list(response.json().items())\n
report_reputation_oracle(records)
Reports reputations used for aggregation Returns: None
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef report_reputation_oracle(self, records: list) -> None:\n \"\"\"\n Reports reputations used for aggregation\n Returns: None\n\n \"\"\"\n\n # method used for experiments, not needed for aggregation\n response = requests.post(\n url=f\"{self.__oracle_url}/reputation\",\n json={\"records\": records, \"round\": self.round, \"sender\": self.__home_ip},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n return None\n
report_time_oracle(start)
Reports time used for aggregation Returns: None
@retry((Exception, requests.exceptions.HTTPError), tries=3, delay=4)\ndef report_time_oracle(self, start: float) -> None:\n \"\"\"\n Reports time used for aggregation\n Returns: None\n\n \"\"\"\n # method used for experiments, not needed for aggregation\n # report aggregation time and round to oracle\n response = requests.post(\n url=f\"{BlockchainHandler.oracle_url}/time\",\n json={\"time\": (time.time_ns() - start) / (10**9), \"round\": self.round},\n headers=self.__rest_header,\n timeout=20, # 10\n )\n\n # raise Exception if status is not successful\n response.raise_for_status()\n\n # increase aggregation round counter after reporting time\n self.round += 1\n return None\n
verify_balance()
Calls blockchain directly for requesting current balance Returns: None
def verify_balance(self) -> None:\n \"\"\"\n Calls blockchain directly for requesting current balance\n Returns: None\n\n \"\"\"\n\n # directly call view method from non-validator node\n balance = self.__web3.eth.get_balance(self.__acc_address, \"latest\")\n\n # convert wei to ether\n balance_eth = self.__web3.from_wei(balance, \"ether\")\n print(\n f\"BLOCKCHAIN: Successfully verified balance of {balance_eth} ETH\",\n flush=True,\n )\n\n # if node ran out of funds, it requests ether from the oracle\n if balance_eth <= 1:\n self.__request_funds_from_oracle()\n\n return None\n
verify_registration()
Verifies that the node itself is registered and registers again if the reputation system returns false. Returns: None
@retry(Exception, tries=3, delay=4)\ndef verify_registration(self) -> None:\n \"\"\"\n Verifies the successful registration of the node itself,\n executes registration again if reputation system returns false\n Returns: None\n\n \"\"\"\n\n # call view function of reputation system to check if registration was not abandoned by hard fork\n confirmation = self.__contract_obj.functions.confirm_registration().call({\"from\": self.__acc_address})\n\n # function returns boolean\n if not confirmation:\n # register again if not successful\n self.__register()\n\n # raise Exception to check again\n raise Exception(\"EXCEPTION: _verify_registration() => Could not be confirmed)\")\n\n return None\n
BlockchainReputation
Bases: Aggregator
Weighted FedAvg that uses the relative reputation of each model's trainer as its aggregation weight. Returns: aggregated model
class BlockchainReputation(Aggregator):\n \"\"\"\n # BAT-SandrinHunkeler (BlockchainReputation)\n Weighted FedAvg by using relative reputation of each model's trainer\n Returns: aggregated model\n \"\"\"\n\n ALGORITHM_MAP = {\n \"Cossim\": cosine_metric,\n \"Pearson\": pearson_correlation_metric,\n \"Euclidean\": euclidean_metric,\n \"Minkowski\": minkowski_metric,\n \"Manhattan\": manhattan_metric,\n \"Jaccard\": jaccard_metric,\n \"CossimEuclid\": cossim_euclidean,\n }\n\n def __init__(self, similarity_metric: str = \"CossimEuclid\", config=None, **kwargs):\n # initialize parent class\n super().__init__(config, **kwargs)\n\n self.config = config\n\n # extract local NEBULA name\n self.node_name = self.config.participant[\"network_args\"][\"addr\"]\n\n # initialize BlockchainHandler for interacting with oracle and non-validator node\n self.__blockchain = BlockchainHandler(self.node_name)\n\n # check if node is malicious for debugging\n self.__malicious = self.config.participant[\"device_args\"][\"malicious\"]\n\n self.__opinion_algo = BlockchainReputation.ALGORITHM_MAP.get(similarity_metric)\n self.__similarity_metric = similarity_metric\n\n def run_aggregation(self, model_buffer: OrderedDict[str, OrderedDict[torch.Tensor, int]]) -> torch.Tensor:\n print_with_frame(\"BLOCKCHAIN AGGREGATION: START\")\n\n # track aggregation time for experiments\n start = time.time_ns()\n\n # verify the registration process during initialization of the BlockchainHandler\n self.__blockchain.verify_registration()\n\n # verify if ether balance is still sufficient for aggregating, request more otherwise\n self.__blockchain.verify_balance()\n\n # create dict<sender, model>\n current_models = {sender: model for sender, (model, weight) in model_buffer.items()}\n\n print(f\"Node: {self.node_name}\", flush=True)\n print(f\"self.__malicious: {self.__malicious}\", flush=True)\n\n # extract local model from models to aggregate\n local_model = model_buffer[self.node_name][0]\n\n # compute 
similarity between local model and all buffered models\n metric_values = {\n sender: max(\n min(\n round(\n self.__opinion_algo(local_model, current_models[sender], similarity=True),\n 5,\n ),\n 1,\n ),\n 0,\n )\n for sender in current_models\n if sender != self.node_name\n }\n\n # log similarity metric values\n print_table(\n \"SIMILARITY METRIC\",\n list(metric_values.items()),\n [\"neighbour Node\", f\"{self.__similarity_metric} Similarity\"],\n )\n\n # increase resolution of metric in upper half of interval\n opinion_values = {sender: round(metric**3 * 100) for sender, metric in metric_values.items()}\n\n # DEBUG\n if int(self.node_name[-7]) <= 1 and self.__blockchain.round >= 5:\n opinion_values = {sender: int(torch.randint(0, 101, (1,))[0]) for sender, metric in metric_values.items()}\n\n # push local opinions to reputation system\n self.__blockchain.push_opinions(opinion_values)\n\n # log pushed opinion values\n print_table(\n \"REPORT LOCAL OPINION\",\n list(opinion_values.items()),\n [\"Node\", f\"Transformed {self.__similarity_metric} Similarity\"],\n )\n\n # request global reputation values from reputation system\n reputation_values = self.__blockchain.get_reputations([sender for sender in current_models])\n\n # log received global reputations\n print_table(\n \"GLOBAL REPUTATION\",\n list(reputation_values.items()),\n [\"Node\", \"Global Reputation\"],\n )\n\n # normalize all reputation values to sum() == 1\n sum_reputations = sum(reputation_values.values())\n if sum_reputations > 0:\n normalized_reputation_values = {\n name: round(reputation_values[name] / sum_reputations, 3) for name in reputation_values\n }\n else:\n normalized_reputation_values = reputation_values\n\n # log normalized aggregation weights\n print_table(\n \"AGGREGATION WEIGHTS\",\n list(normalized_reputation_values.items()),\n [\"Node\", \"Aggregation Weight\"],\n )\n\n # initialize empty model\n final_model = {layer: torch.zeros_like(param).float() for layer, param in 
local_model.items()}\n\n # cover rare case where no models were added or reputation is 0 to return the local model\n if sum_reputations > 0:\n for sender in normalized_reputation_values.keys():\n for layer in final_model:\n final_model[layer] += current_models[sender][layer].float() * normalized_reputation_values[sender]\n\n # otherwise, skip aggregation\n else:\n final_model = local_model\n\n # report used gas to oracle and log cumulative gas used\n print_table(\n \"TOTAL GAS USED\",\n self.__blockchain.report_gas_oracle(),\n [\"Node\", \"Cumulative Gas used\"],\n )\n self.__blockchain.report_time_oracle(start)\n\n print_with_frame(\"BLOCKCHAIN AGGREGATION: FINISHED\")\n\n # return newly aggregated model\n return final_model\n
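The numeric steps in `run_aggregation` can be followed on plain numbers: clamp each similarity into [0, 1], cube it and scale it to an integer opinion, then normalize the returned reputations into aggregation weights. The sketch below mirrors those steps under simplifying assumptions: each model has a single scalar "layer" in place of tensors, no blockchain is involved, and the helper names and sample values are hypothetical.

```python
def clamp_similarity(value):
    """Round to 5 decimals and clamp into [0, 1], as done for metric_values."""
    return max(min(round(value, 5), 1), 0)


def to_opinion(similarity):
    """Cube the similarity and scale it to an integer opinion in 0..100."""
    return round(similarity**3 * 100)


def normalize(reputations):
    """Scale reputations so they sum to about 1; leave them alone if the sum is 0."""
    total = sum(reputations.values())
    if total <= 0:
        return dict(reputations)
    return {name: round(value / total, 3) for name, value in reputations.items()}


def weighted_average(models, weights):
    """Combine scalar 'layers' using one weight per sender."""
    layers = next(iter(models.values())).keys()
    return {layer: sum(models[name][layer] * weights[name] for name in weights) for layer in layers}


# made-up similarities, including values outside [0, 1]
similarities = {"node_b": 0.9, "node_c": 1.2, "node_d": -0.3}
opinions = {name: to_opinion(clamp_similarity(s)) for name, s in similarities.items()}

# made-up reputations standing in for the values returned by the reputation system
reputations = {"node_a": 60.0, "node_b": 30.0, "node_c": 10.0}
weights = normalize(reputations)

models = {"node_a": {"w": 1.0}, "node_b": {"w": 2.0}, "node_c": {"w": 4.0}}
merged = weighted_average(models, weights)
print(opinions, weights, merged)
```

Cubing compresses low similarities towards 0 and spreads out values near 1, which is what the source comment calls increasing the resolution in the upper half of the interval. Because each weight is rounded to three decimals, the weights can sum to slightly more or less than 1.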
print_table(title, values, headers)
Prints a title, all values ordered in a table, with the headers as column titles. Args: title: Title of the table values: Rows of table headers: Column headers of table
Returns: None, prints output
def print_table(title: str, values: list[tuple | list], headers: list[str]) -> None:\n \"\"\"\n Prints a title, all values ordered in a table, with the headers as column titles.\n Args:\n title: Title of the table\n values: Rows of table\n headers: Column headers of table\n\n Returns: None, prints output\n\n \"\"\"\n print(f\"\\n{'-' * 25} {title.upper()} {'-' * 25}\", flush=True)\n print(tabulate(sorted(values), headers=headers, tablefmt=\"grid\"), flush=True)\n
print_with_frame(message)
Prints a large frame with a title inside Args: message: Title to put into the frame
def print_with_frame(message) -> None:\n \"\"\"\n Prints a large frame with a title inside\n Args:\n message: Title to put into the frame\n\n Returns: None\n\n \"\"\"\n message_length = len(message)\n print(f\"{' ' * 20}+{'-' * (message_length + 2)}+\", flush=True)\n print(f\"{'*' * 20}| {message.upper()} |{'*' * 20}\", flush=True)\n print(f\"{' ' * 20}+{'-' * (message_length + 2)}+\", flush=True)\n
DualHistAgg
Aggregator: Dual History Aggregation (DualHistAgg) Authors: Enrique et al. Year: 2024
nebula/core/aggregation/dualhistagg.py
class DualHistAgg(Aggregator):\n \"\"\"\n Aggregator: Dual History Aggregation (DualHistAgg)\n Authors: Enrique et al.\n Year: 2024\n \"\"\"\n\n def __init__(self, config=None, **kwargs):\n super().__init__(config, **kwargs)\n\n def softmax(self, x):\n # Safeguard against empty input array\n if x.size == 0:\n return np.array([])\n # subtract the maximum for numerical stability\n e_x = np.exp(x - np.max(x))\n return e_x / e_x.sum(axis=0) # normalize so the weights sum to 1\n\n def run_aggregation(self, models, reference_model=None):\n if len(models) == 0:\n logging.error(\"Trying to aggregate models when there are no models\")\n return None, None\n\n models = list(models.values())\n num_models = len(models)\n logging.info(f\"Number of models: {num_models}\")\n\n if num_models == 1:\n logging.info(\"Only one model, returning it\")\n return models[0][0], models[0][0]\n\n # Total Samples\n total_samples = float(sum(w for _, w in models))\n # Create a Zero Model\n accum = {\n layer: torch.zeros_like(param).float() for layer, param in models[0][0].items()\n } # use first model for template\n # clone each tensor: dict.copy() is shallow, so both accumulators would share the same tensors\n accum_similarity = {layer: param.clone() for layer, param in accum.items()}\n\n similarities = (\n [cosine_metric(model, reference_model) for model, _ in models] if reference_model else [1] * num_models\n )\n\n logging.info(f\"Similarities: {similarities}\")\n weights = self.softmax(np.array(similarities))\n logging.info(f\"Weights: {weights}\")\n\n # Aggregation process\n for (model, _), weight, sim_weight in zip(models, weights, similarities, strict=False):\n for layer in accum:\n accum[layer] += model[layer].float() * float(weight)\n accum_similarity[layer] += model[layer].float() * float(sim_weight)\n\n # Normalize aggregated models\n for layer in accum:\n accum[layer] /= total_samples\n accum_similarity[layer] /= total_samples\n\n return accum, accum_similarity\n
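The `softmax` helper turns the cosine similarities into positive weights that sum to 1, and subtracting the maximum first avoids overflow in the exponential. Below is a minimal sketch of the same computation, assuming plain Python lists in place of NumPy arrays and made-up similarity values.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    if not values:
        return []
    peak = max(values)
    # shifting by the maximum leaves the result unchanged but keeps exp() bounded
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


similarities = [0.9, 0.5, 0.1]  # hypothetical cosine similarities
weights = softmax(similarities)
print(weights)
```

A model that is more similar to the reference model receives a larger weight, and no weight is ever exactly zero.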
FedAvg
Aggregator: Federated Averaging (FedAvg) Authors: McMahan et al. Year: 2016
nebula/core/aggregation/fedavg.py
class FedAvg(Aggregator):\n \"\"\"\n Aggregator: Federated Averaging (FedAvg)\n Authors: McMahan et al.\n Year: 2016\n \"\"\"\n\n def __init__(self, config=None, **kwargs):\n super().__init__(config, **kwargs)\n\n def run_aggregation(self, models):\n super().run_aggregation(models)\n\n models = list(models.values())\n\n total_samples = float(sum(weight for _, weight in models))\n\n if total_samples == 0:\n raise ValueError(\"Total number of samples must be greater than zero.\")\n\n last_model_params = models[-1][0]\n accum = {layer: torch.zeros_like(param, dtype=torch.float32) for layer, param in last_model_params.items()}\n\n with torch.no_grad():\n for model_parameters, weight in models:\n normalized_weight = weight / total_samples\n for layer in accum:\n accum[layer].add_(\n model_parameters[layer].to(accum[layer].dtype),\n alpha=normalized_weight,\n )\n\n del models\n gc.collect()\n\n # self.print_model_size(accum)\n return accum\n
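FedAvg weights each model by its share of the total sample count and sums the results layer by layer. The sketch below reproduces that arithmetic, assuming plain lists of floats in place of tensors. The function name, layer name and sample counts are hypothetical.

```python
def fedavg(models):
    """models: list of (params, num_samples); params maps layer name -> list of floats."""
    total_samples = float(sum(n for _, n in models))
    if total_samples == 0:
        raise ValueError("Total number of samples must be greater than zero.")

    # the last model serves only as a shape template, as in the class above
    template = models[-1][0]
    accum = {layer: [0.0] * len(values) for layer, values in template.items()}

    for params, n in models:
        share = n / total_samples
        for layer in accum:
            accum[layer] = [a + share * v for a, v in zip(accum[layer], params[layer])]
    return accum


averaged = fedavg([
    ({"w": [1.0, 2.0]}, 10),  # contributes 10 / 40 = 0.25
    ({"w": [3.0, 6.0]}, 30),  # contributes 30 / 40 = 0.75
])
print(averaged)
```

Here the second model holds three quarters of the samples, so the result lies three quarters of the way from the first model to the second.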
FedAvgSVM
Aggregator: Federated Averaging (FedAvg) Authors: McMahan et al. Year: 2016 Note: This is a modified version of FedAvg for SVMs.
nebula/core/aggregation/fedavgSVM.py
class FedAvgSVM(Aggregator):\n \"\"\"\n Aggregator: Federated Averaging (FedAvg)\n Authors: McMahan et al.\n Year: 2016\n Note: This is a modified version of FedAvg for SVMs.\n \"\"\"\n\n def __init__(self, config=None, **kwargs):\n super().__init__(config, **kwargs)\n\n def run_aggregation(self, models):\n super().run_aggregation(models)\n\n models = list(models.values())\n\n total_samples = sum([y for _, y in models])\n\n coeff_accum = np.zeros_like(models[-1][0].coef_)\n intercept_accum = 0.0\n\n for model, w in models:\n if not isinstance(model, LinearSVC):\n return None\n coeff_accum += model.coef_ * w\n intercept_accum += model.intercept_ * w\n\n coeff_accum /= total_samples\n intercept_accum /= total_samples\n\n aggregated_svm = LinearSVC()\n aggregated_svm.coef_ = coeff_accum\n aggregated_svm.intercept_ = intercept_accum\n\n return aggregated_svm\n
Krum
Aggregator: Krum Authors: Peva Blanchard et al. Year: 2017 Note: https://papers.nips.cc/paper/2017/hash/f4b9ec30ad9f68f89b29639786cb62ef-Abstract.html
nebula/core/aggregation/krum.py
class Krum(Aggregator):\n \"\"\"\n Aggregator: Krum\n Authors: Peva Blanchard et al.\n Year: 2017\n Note: https://papers.nips.cc/paper/2017/hash/f4b9ec30ad9f68f89b29639786cb62ef-Abstract.html\n \"\"\"\n\n def __init__(self, config=None, **kwargs):\n super().__init__(config, **kwargs)\n\n def run_aggregation(self, models):\n super().run_aggregation(models)\n\n models = list(models.values())\n\n accum = {layer: torch.zeros_like(param).float() for layer, param in models[-1][0].items()}\n total_models = len(models)\n distance_list = [0 for i in range(0, total_models)]\n min_index = 0\n min_distance_sum = float(\"inf\")\n\n for i in range(0, total_models):\n m1, _ = models[i]\n for j in range(0, total_models):\n m2, _ = models[j]\n distance = 0\n if i == j:\n distance = 0\n else:\n for layer in m1:\n l1 = m1[layer]\n\n l2 = m2[layer]\n distance += numpy.linalg.norm(l1 - l2)\n distance_list[i] += distance\n\n if min_distance_sum > distance_list[i]:\n min_distance_sum = distance_list[i]\n min_index = i\n m, _ = models[min_index]\n for layer in m:\n accum[layer] = accum[layer] + m[layer]\n\n return accum\n
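As implemented above, the aggregator scores each model by the sum of its per-layer Euclidean distances to every other model and returns the model with the lowest score. The sketch below shows that selection, assuming plain lists in place of tensors. `krum_select` and the sample models are hypothetical. It mirrors this class, not the scoring rule in the original Krum paper, which sums squared distances over only the n - f - 2 nearest neighbours.

```python
import math


def layer_distance(a, b):
    """Euclidean distance between two flat layers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def krum_select(models):
    """Return the index of the model with the smallest summed distance to all others."""
    best_index, best_score = 0, float("inf")
    for i, m1 in enumerate(models):
        score = 0.0
        for j, m2 in enumerate(models):
            if i == j:
                continue
            score += sum(layer_distance(m1[layer], m2[layer]) for layer in m1)
        # strict comparison keeps the first model when scores tie
        if score < best_score:
            best_index, best_score = i, score
    return best_index


models = [
    {"w": [1.0, 1.0]},
    {"w": [1.1, 0.9]},
    {"w": [0.9, 1.1]},
    {"w": [10.0, 10.0]},  # outlier, e.g. a poisoned update
]
chosen = krum_select(models)
print(chosen)
```

The outlier is far from every other model, so its score is the largest and it is never selected. The result is one participant's model unchanged, not an average.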
Median
Aggregator: Median Authors: Dong Yin et al. Year: 2021 Note: https://arxiv.org/pdf/1803.01498.pdf
nebula/core/aggregation/median.py
class Median(Aggregator):\n \"\"\"\n Aggregator: Median\n Authors: Dong Yin et al.\n Year: 2021\n Note: https://arxiv.org/pdf/1803.01498.pdf\n \"\"\"\n\n def __init__(self, config=None, **kwargs):\n super().__init__(config, **kwargs)\n\n def get_median(self, weights):\n # number of models stacked along the first dimension\n weight_len = len(weights)\n\n median = 0\n if weight_len % 2 == 1:\n # odd number, return the median\n median, _ = torch.median(weights, 0)\n else:\n # even number, return the mean of the two middle values\n # partially sort so the two middle values sit at positions start and end - 1\n arr_weights = np.asarray(weights)\n nobs = arr_weights.shape[0]\n start = int(nobs / 2) - 1\n end = int(nobs / 2) + 1\n atmp = np.partition(arr_weights, (start, end - 1), 0)\n sl = [slice(None)] * atmp.ndim\n sl[0] = slice(start, end)\n arr_median = np.mean(atmp[tuple(sl)], axis=0)\n median = torch.tensor(arr_median)\n return median\n\n def run_aggregation(self, models):\n super().run_aggregation(models)\n\n models = list(models.values())\n models_params = [m for m, _ in models]\n\n total_models = len(models)\n\n accum = {layer: torch.zeros_like(param).float() for layer, param in models[-1][0].items()}\n\n # Calculate the median for each parameter\n for layer in accum:\n weight_layer = accum[layer]\n # get the shape of layer tensor\n l_shape = list(weight_layer.shape)\n\n # get the number of elements of layer tensor\n number_layer_weights = torch.numel(weight_layer)\n # if it's a 0-d tensor\n if l_shape == []:\n weights = torch.tensor([models_params[j][layer] for j in range(0, total_models)])\n weights = weights.double()\n w = self.get_median(weights)\n accum[layer] = w\n\n else:\n # flatten the tensor\n weight_layer_flatten = weight_layer.view(number_layer_weights)\n\n # flatten the tensor of each model\n models_layer_weight_flatten = torch.stack(\n [models_params[j][layer].view(number_layer_weights) for j in range(0, total_models)],\n 0,\n )\n\n # get the weight list [w1j,w2j,\u00b7\u00b7\u00b7 ,wmj], where wij is the jth 
parameter of the ith local model\n median = self.get_median(models_layer_weight_flatten)\n accum[layer] = median.view(l_shape)\n return accum\n
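The aggregation is coordinate-wise: for each parameter position, it takes the median of that value across all models, averaging the two middle values when the number of models is even. The sketch below shows the same rule, assuming plain lists in place of tensors and using the standard library's `statistics.median`, which applies the same even-count convention. The function name and sample models are hypothetical.

```python
from statistics import median


def coordinate_median(models):
    """models: list of dicts mapping layer name -> flat list of floats."""
    result = {}
    for layer in models[0]:
        # zip(*...) groups the j-th parameter of every model together
        columns = zip(*(m[layer] for m in models))
        result[layer] = [median(column) for column in columns]
    return result


odd = coordinate_median([
    {"w": [1.0, 10.0]},
    {"w": [2.0, 20.0]},
    {"w": [100.0, 30.0]},  # the outlier in the first coordinate is ignored
])
even = coordinate_median([{"w": [1.0]}, {"w": [2.0]}, {"w": [3.0]}, {"w": [100.0]}])
print(odd, even)
```

A single extreme value cannot move the median, which is why this aggregator tolerates a minority of corrupted updates.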
TrimmedMean
Aggregator: TrimmedMean Authors: Dong Yin et al. Year: 2021 Note: https://arxiv.org/pdf/1803.01498.pdf
nebula/core/aggregation/trimmedmean.py
class TrimmedMean(Aggregator):\n \"\"\"\n Aggregator: TrimmedMean\n Authors: Dong Yin et al.\n Year: 2021\n Note: https://arxiv.org/pdf/1803.01498.pdf\n \"\"\"\n\n def __init__(self, config=None, beta=0, **kwargs):\n super().__init__(config, **kwargs)\n self.beta = beta\n\n def get_trimmedmean(self, weights):\n # number of models stacked along the first dimension\n weight_len = len(weights)\n\n if weight_len <= 2 * self.beta:\n # too few models to trim \u03b2 items from each end, fall back to the plain mean\n res = torch.mean(weights, 0)\n\n else:\n # remove the largest and smallest \u03b2 items\n arr_weights = np.asarray(weights)\n nobs = arr_weights.shape[0]\n start = self.beta\n end = nobs - self.beta\n atmp = np.partition(arr_weights, (start, end - 1), 0)\n sl = [slice(None)] * atmp.ndim\n sl[0] = slice(start, end)\n # get the mean of the remaining weights\n arr_mean = np.mean(atmp[tuple(sl)], axis=0)\n res = torch.tensor(arr_mean)\n\n return res\n\n def run_aggregation(self, models):\n super().run_aggregation(models)\n\n models = list(models.values())\n models_params = [m for m, _ in models]\n\n total_models = len(models)\n\n accum = {layer: torch.zeros_like(param).float() for layer, param in models[-1][0].items()}\n\n for layer in accum:\n weight_layer = accum[layer]\n # get the shape of layer tensor\n l_shape = list(weight_layer.shape)\n\n # get the number of elements of layer tensor\n number_layer_weights = torch.numel(weight_layer)\n # if it's a 0-d tensor\n if l_shape == []:\n weights = torch.tensor([models_params[j][layer] for j in range(0, total_models)])\n weights = weights.double()\n w = self.get_trimmedmean(weights)\n accum[layer] = w\n\n else:\n # flatten the tensor\n weight_layer_flatten = weight_layer.view(number_layer_weights)\n\n # flatten the tensor of each model\n models_layer_weight_flatten = torch.stack(\n [models_params[j][layer].view(number_layer_weights) for j in range(0, total_models)],\n 0,\n )\n\n # get the weight list [w1j,w2j,\u00b7\u00b7\u00b7 
,wmj], where wij is the jth parameter of the ith local model\n trimmedmean = self.get_trimmedmean(models_layer_weight_flatten)\n accum[layer] = trimmedmean.view(l_shape)\n\n return accum\n
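For each parameter position, the trimmed mean drops the `beta` smallest and `beta` largest values across models and averages the rest. When there are at most `2 * beta` models, it falls back to the plain mean. The sketch below applies that rule to a single coordinate, assuming a plain list of floats in place of a tensor. The function name and sample values are hypothetical.

```python
def trimmed_mean(values, beta):
    """Mean of `values` after dropping the beta smallest and beta largest entries."""
    n = len(values)
    if n <= 2 * beta:
        # nothing would remain after trimming, so use the plain mean
        return sum(values) / n
    kept = sorted(values)[beta : n - beta]
    return sum(kept) / len(kept)


column = [1.0, 2.0, 3.0, 100.0]  # one parameter as reported by four models
print(trimmed_mean(column, beta=0))  # plain mean, the default beta
print(trimmed_mean(column, beta=1))  # drops 1.0 and 100.0
print(trimmed_mean(column, beta=2))  # 4 <= 2 * 2, falls back to the plain mean
```

With the default `beta=0` nothing is trimmed, so the aggregator behaves like an unweighted mean. A robust setup would pick `beta` at least as large as the number of updates it expects to be corrupted.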
NebulaDataset
Bases: Dataset, ABC
Abstract class for a partitioned dataset.
Classes inheriting from this class need to implement specific methods for loading and partitioning the dataset.
nebula/core/datasets/nebuladataset.py
class NebulaDataset(Dataset, ABC):\n \"\"\"\n Abstract class for a partitioned dataset.\n\n Classes inheriting from this class need to implement specific methods\n for loading and partitioning the dataset.\n \"\"\"\n\n def __init__(\n self,\n num_classes=10,\n partition_id=0,\n partitions_number=1,\n batch_size=32,\n num_workers=4,\n iid=True,\n partition=\"dirichlet\",\n partition_parameter=0.5,\n seed=42,\n config=None,\n ):\n super().__init__()\n\n if partition_id < 0 or partition_id >= partitions_number:\n raise ValueError(f\"partition_id {partition_id} is out of range for partitions_number {partitions_number}\")\n\n self.num_classes = num_classes\n self.partition_id = partition_id\n self.partitions_number = partitions_number\n self.batch_size = batch_size\n self.num_workers = num_workers\n self.iid = iid\n self.partition = partition\n self.partition_parameter = partition_parameter\n self.seed = seed\n self.config = config\n\n self.train_set = None\n self.train_indices_map = None\n self.test_set = None\n self.test_indices_map = None\n\n # Classes of the participants to be sure that the same classes are used in training and testing\n self.class_distribution = None\n\n enable_deterministic(config)\n\n if self.partition_id == 0:\n self.initialize_dataset()\n else:\n max_tries = 10\n for i in range(max_tries):\n try:\n self.initialize_dataset()\n break\n except Exception as e:\n logging_training.info(f\"Error loading dataset: {e}. Retrying {i + 1}/{max_tries} in 5 seconds...\")\n time.sleep(5)\n\n @abstractmethod\n def initialize_dataset(self):\n \"\"\"\n Initialize the dataset. 
This should load or create the dataset.\n \"\"\"\n pass\n\n @abstractmethod\n def generate_non_iid_map(self, dataset, partition=\"dirichlet\", plot=False):\n \"\"\"\n Create a non-iid map of the dataset.\n \"\"\"\n pass\n\n @abstractmethod\n def generate_iid_map(self, dataset, plot=False):\n \"\"\"\n Create an iid map of the dataset.\n \"\"\"\n pass\n\n def get_train_labels(self):\n \"\"\"\n Get the labels of the training set based on the indices map.\n \"\"\"\n if self.train_indices_map is None:\n return None\n return [self.train_set.targets[idx] for idx in self.train_indices_map]\n\n def get_test_labels(self):\n \"\"\"\n Get the labels of the test set based on the indices map.\n \"\"\"\n if self.test_indices_map is None:\n return None\n return [self.test_set.targets[idx] for idx in self.test_indices_map]\n\n def get_local_test_labels(self):\n \"\"\"\n Get the labels of the local test set based on the indices map.\n \"\"\"\n if self.local_test_indices_map is None:\n return None\n return [self.test_set.targets[idx] for idx in self.local_test_indices_map]\n\n def plot_data_distribution(self, dataset, partitions_map):\n \"\"\"\n Plot the data distribution of the dataset.\n\n Plot the data distribution of the dataset according to the partitions map provided.\n\n Args:\n dataset: The dataset to plot (torch.utils.data.Dataset).\n partitions_map: The map of the dataset partitions.\n \"\"\"\n # Plot the data distribution of the dataset, one graph per partition\n sns.set()\n sns.set_style(\"whitegrid\", {\"axes.grid\": False})\n sns.set_context(\"paper\", font_scale=1.5)\n sns.set_palette(\"Set2\")\n\n for i in range(self.partitions_number):\n indices = partitions_map[i]\n class_counts = [0] * self.num_classes\n for idx in indices:\n label = dataset.targets[idx]\n class_counts[label] += 1\n logging_training.info(f\"Participant {i + 1} class distribution: {class_counts}\")\n plt.figure()\n plt.bar(range(self.num_classes), class_counts)\n plt.xlabel(\"Class\")\n 
plt.ylabel(\"Number of samples\")\n plt.xticks(range(self.num_classes))\n if self.iid:\n plt.title(f\"Participant {i + 1} class distribution (IID)\")\n else:\n plt.title(\n f\"Participant {i + 1} class distribution (Non-IID - {self.partition}) - {self.partition_parameter}\"\n )\n plt.tight_layout()\n path_to_save = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/participant_{i}_class_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n\n plt.figure()\n max_point_size = 500\n min_point_size = 0\n\n for i in range(self.partitions_number):\n class_counts = [0] * self.num_classes\n indices = partitions_map[i]\n for idx in indices:\n label = dataset.targets[idx]\n class_counts[label] += 1\n\n # Normalize the point sizes for this partition\n max_samples_partition = max(class_counts)\n sizes = [\n (size / max_samples_partition) * (max_point_size - min_point_size) + min_point_size\n for size in class_counts\n ]\n plt.scatter([i] * self.num_classes, range(self.num_classes), s=sizes, alpha=0.5)\n\n plt.xlabel(\"Participant\")\n plt.ylabel(\"Class\")\n plt.xticks(range(self.partitions_number))\n plt.yticks(range(self.num_classes))\n if self.iid:\n plt.title(f\"Participant {i + 1} class distribution (IID)\")\n else:\n plt.title(\n f\"Participant {i + 1} class distribution (Non-IID - {self.partition}) - {self.partition_parameter}\"\n )\n plt.tight_layout()\n\n # Saves the distribution display with circles of different size\n path_to_save = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/class_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n\n if hasattr(self, \"tsne\") and self.tsne:\n 
self.visualize_tsne(dataset)\n\n def visualize_tsne(self, dataset):\n X = [] # List for storing the characteristics of the samples\n y = [] # Ready to store the labels of the samples\n for idx in range(len(dataset)): # Assuming that 'dataset' is a list or array of your samples\n sample, label = dataset[idx]\n X.append(sample.flatten())\n y.append(label)\n\n X = np.array(X)\n y = np.array(y)\n\n tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)\n tsne_results = tsne.fit_transform(X)\n\n plt.figure(figsize=(16, 10))\n sns.scatterplot(\n x=tsne_results[:, 0],\n y=tsne_results[:, 1],\n hue=y,\n palette=sns.color_palette(\"hsv\", self.num_classes),\n legend=\"full\",\n alpha=0.7,\n )\n\n plt.title(\"t-SNE visualization of the dataset\")\n plt.xlabel(\"t-SNE axis 1\")\n plt.ylabel(\"t-SNE axis 2\")\n plt.legend(title=\"Class\")\n plt.tight_layout()\n\n path_to_save_tsne = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/tsne_visualization.png\"\n plt.savefig(path_to_save_tsne, dpi=300, bbox_inches=\"tight\")\n plt.close()\n\n def dirichlet_partition(self, dataset, alpha=0.5, min_samples_per_class=10):\n y_data = self._get_targets(dataset)\n unique_labels = np.unique(y_data)\n logging_training.info(f\"Labels unique: {unique_labels}\")\n num_samples = len(y_data)\n\n indices_per_partition = [[] for _ in range(self.partitions_number)]\n label_distribution = self.class_distribution if self.class_distribution is not None else None\n\n for label in unique_labels:\n label_indices = np.where(y_data == label)[0]\n np.random.shuffle(label_indices)\n\n if label_distribution is None:\n proportions = np.random.dirichlet([alpha] * self.partitions_number)\n else:\n proportions = label_distribution[label]\n\n proportions = self._adjust_proportions(proportions, indices_per_partition, num_samples)\n split_points = (np.cumsum(proportions) * len(label_indices)).astype(int)[:-1]\n\n for partition_idx, indices in 
enumerate(np.split(label_indices, split_points)):\n if len(indices) < min_samples_per_class:\n indices_per_partition[partition_idx].extend([])\n else:\n indices_per_partition[partition_idx].extend(indices)\n\n if label_distribution is None:\n self.class_distribution = self._calculate_class_distribution(indices_per_partition, y_data)\n\n return {i: indices for i, indices in enumerate(indices_per_partition)}\n\n def _adjust_proportions(self, proportions, indices_per_partition, num_samples):\n adjusted = np.array([\n p * (len(indices) < num_samples / self.partitions_number)\n for p, indices in zip(proportions, indices_per_partition, strict=False)\n ])\n return adjusted / adjusted.sum()\n\n def _calculate_class_distribution(self, indices_per_partition, y_data):\n distribution = defaultdict(lambda: np.zeros(self.partitions_number))\n for partition_idx, indices in enumerate(indices_per_partition):\n labels, counts = np.unique(y_data[indices], return_counts=True)\n for label, count in zip(labels, counts, strict=False):\n distribution[label][partition_idx] = count\n return {k: v / v.sum() for k, v in distribution.items()}\n\n @staticmethod\n def _get_targets(dataset) -> np.ndarray:\n if isinstance(dataset.targets, np.ndarray):\n return dataset.targets\n elif hasattr(dataset.targets, \"numpy\"):\n return dataset.targets.numpy()\n else:\n return np.asarray(dataset.targets)\n\n def homo_partition(self, dataset):\n \"\"\"\n Homogeneously partition the dataset into multiple subsets.\n\n This function divides a dataset into a specified number of subsets, where each subset\n is intended to have a roughly equal number of samples. This method aims to ensure a\n homogeneous distribution of data across all subsets. It's particularly useful in\n scenarios where a uniform distribution of data is desired among all federated learning\n clients.\n\n Args:\n dataset (torch.utils.data.Dataset): The dataset to partition. 
It should have\n 'data' and 'targets' attributes.\n\n Returns:\n dict: A dictionary where keys are subset indices (ranging from 0 to partitions_number-1)\n and values are lists of indices corresponding to the samples in each subset.\n\n The function randomly shuffles the entire dataset and then splits it into the number\n of subsets specified by `partitions_number`. It ensures that each subset has a similar number\n of samples. The function also prints the class distribution in each subset for reference.\n\n Example usage:\n federated_data = homo_partition(my_dataset)\n # This creates federated data subsets with homogeneous distribution.\n \"\"\"\n n_nets = self.partitions_number\n\n n_train = len(dataset.targets)\n np.random.seed(self.seed)\n idxs = np.random.permutation(n_train)\n batch_idxs = np.array_split(idxs, n_nets)\n net_dataidx_map = {i: batch_idxs[i] for i in range(n_nets)}\n\n # partitioned_datasets = []\n for i in range(self.partitions_number):\n # subset = torch.utils.data.Subset(dataset, net_dataidx_map[i])\n # partitioned_datasets.append(subset)\n\n # Print class distribution in the current partition\n class_counts = [0] * self.num_classes\n for idx in net_dataidx_map[i]:\n label = dataset.targets[idx]\n class_counts[label] += 1\n logging_training.info(f\"Partition {i + 1} class distribution: {class_counts}\")\n\n return net_dataidx_map\n\n def balanced_iid_partition(self, dataset):\n \"\"\"\n Partition the dataset into balanced and IID (Independent and Identically Distributed)\n subsets for each client.\n\n This function divides a dataset into a specified number of subsets (federated clients),\n where each subset has an equal class distribution. This makes the partition suitable for\n simulating IID data scenarios in federated learning.\n\n Args:\n dataset (list): The dataset to partition. 
It should be a list of tuples where each\n tuple represents a data sample and its corresponding label.\n\n Returns:\n dict: A dictionary where keys are client IDs (ranging from 0 to partitions_number-1) and\n values are lists of indices corresponding to the samples assigned to each client.\n\n The function ensures that each class is represented equally in each subset. The\n partitioning process involves iterating over each class, shuffling the indices of that class,\n and then splitting them equally among the clients. The function does not print the class\n distribution in each subset.\n\n Example usage:\n federated_data = balanced_iid_partition(my_dataset)\n # This creates federated data subsets with equal class distributions.\n \"\"\"\n num_clients = self.partitions_number\n clients_data = {i: [] for i in range(num_clients)}\n\n # Get the labels from the dataset\n if isinstance(dataset.targets, np.ndarray):\n labels = dataset.targets\n elif hasattr(dataset.targets, \"numpy\"): # Check if it's a tensor with .numpy() method\n labels = dataset.targets.numpy()\n else: # If it's a list\n labels = np.asarray(dataset.targets)\n\n label_counts = np.bincount(labels)\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n for label in range(self.num_classes):\n # Get the indices of the same label samples\n label_indices = np.where(labels == label)[0]\n np.random.seed(self.seed)\n np.random.shuffle(label_indices)\n\n # Split the data based on their labels\n samples_per_client = min_count // num_clients\n\n for i in range(num_clients):\n start_idx = i * samples_per_client\n end_idx = (i + 1) * samples_per_client\n clients_data[i].extend(label_indices[start_idx:end_idx])\n\n return clients_data\n\n def unbalanced_iid_partition(self, dataset, imbalance_factor=2):\n \"\"\"\n Partition the dataset into multiple IID (Independent and Identically Distributed)\n subsets with different size.\n\n This function divides a dataset into a specified number of IID 
subsets (federated\n clients), where each subset has a different number of samples. The number of samples\n in each subset is determined by an imbalance factor, making the partition suitable\n for simulating imbalanced data scenarios in federated learning.\n\n Args:\n dataset (list): The dataset to partition. It should be a list of tuples where\n each tuple represents a data sample and its corresponding label.\n imbalance_factor (float): The factor to determine the degree of imbalance\n among the subsets. A lower imbalance factor leads to more\n imbalanced partitions.\n\n Returns:\n dict: A dictionary where keys are client IDs (ranging from 0 to partitions_number-1) and\n values are lists of indices corresponding to the samples assigned to each client.\n\n The function ensures that each class is represented in each subset but with varying\n proportions. The partitioning process involves iterating over each class, shuffling\n the indices of that class, and then splitting them according to the calculated subset\n sizes. 
The function does not print the class distribution in each subset.\n\n Example usage:\n federated_data = unbalanced_iid_partition(my_dataset, imbalance_factor=2)\n # This creates federated data subsets with varying number of samples based on\n # an imbalance factor of 2.\n \"\"\"\n num_clients = self.partitions_number\n clients_data = {i: [] for i in range(num_clients)}\n\n # Get the labels from the dataset\n labels = np.array([dataset.targets[idx] for idx in range(len(dataset))])\n label_counts = np.bincount(labels)\n\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n # Set the initial_subset_size\n initial_subset_size = min_count // num_clients\n\n # Calculate the number of samples for each subset based on the imbalance factor\n subset_sizes = [initial_subset_size]\n for i in range(1, num_clients):\n subset_sizes.append(int(subset_sizes[i - 1] * ((imbalance_factor - 1) / imbalance_factor)))\n\n for label in range(self.num_classes):\n # Get the indices of the same label samples\n label_indices = np.where(labels == label)[0]\n np.random.seed(self.seed)\n np.random.shuffle(label_indices)\n\n # Split the data based on their labels\n start = 0\n for i in range(num_clients):\n end = start + subset_sizes[i]\n clients_data[i].extend(label_indices[start:end])\n start = end\n\n return clients_data\n\n def percentage_partition(self, dataset, percentage=20):\n \"\"\"\n Partition a dataset into multiple subsets with a specified level of non-IID-ness.\n\n This function divides a dataset into a specified number of subsets (federated\n clients), where each subset has a different class distribution. The class\n distribution in each subset is determined by a specified percentage, making the\n partition suitable for simulating non-IID (non-Independently and Identically\n Distributed) data scenarios in federated learning.\n\n Args:\n dataset (torch.utils.data.Dataset): The dataset to partition. 
It should have\n 'data' and 'targets' attributes.\n percentage (int): A value between 0 and 100 that specifies the desired\n level of non-IID-ness for the labels of the federated data.\n This percentage controls the imbalance in the class distribution\n across different subsets.\n\n Returns:\n dict: A dictionary where keys are subset indices (ranging from 0 to partitions_number-1)\n and values are lists of indices corresponding to the samples in each subset.\n\n The function ensures that the number of classes in each subset varies based on the selected\n percentage. The partitioning process involves iterating over each class, shuffling the\n indices of that class, and then splitting them according to the calculated subset sizes.\n The function also prints the class distribution in each subset for reference.\n\n Example usage:\n federated_data = percentage_partition(my_dataset, percentage=20)\n # This creates federated data subsets with varying class distributions based on\n # a percentage of 20.\n \"\"\"\n if isinstance(dataset.targets, np.ndarray):\n y_train = dataset.targets\n elif hasattr(dataset.targets, \"numpy\"): # Check if it's a tensor with .numpy() method\n y_train = dataset.targets.numpy()\n else: # If it's a list\n y_train = np.asarray(dataset.targets)\n\n num_classes = self.num_classes\n num_subsets = self.partitions_number\n class_indices = {i: np.where(y_train == i)[0] for i in range(num_classes)}\n\n # Get the labels from the dataset\n labels = np.array([dataset.targets[idx] for idx in range(len(dataset))])\n label_counts = np.bincount(labels)\n\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n classes_per_subset = int(num_classes * percentage / 100)\n if classes_per_subset < 1:\n raise ValueError(\"The percentage is too low to assign at least one class to each subset.\")\n\n subset_indices = [[] for _ in range(num_subsets)]\n class_list = list(range(num_classes))\n np.random.seed(self.seed)\n 
np.random.shuffle(class_list)\n\n for i in range(num_subsets):\n for j in range(classes_per_subset):\n # Use modulo operation to cycle through the class_list\n class_idx = class_list[(i * classes_per_subset + j) % num_classes]\n indices = class_indices[class_idx]\n np.random.seed(self.seed)\n np.random.shuffle(indices)\n # Select approximately 50% of the indices\n subset_indices[i].extend(indices[: min_count // 2])\n\n class_counts = np.bincount(np.array([dataset.targets[idx] for idx in subset_indices[i]]))\n logging_training.info(f\"Partition {i + 1} class distribution: {class_counts.tolist()}\")\n\n partitioned_datasets = {i: subset_indices[i] for i in range(num_subsets)}\n\n return partitioned_datasets\n\n def plot_all_data_distribution(self, dataset, partitions_map):\n \"\"\"\n\n Plot all of the data distribution of the dataset according to the partitions map provided.\n\n Args:\n dataset: The dataset to plot (torch.utils.data.Dataset).\n partitions_map: The map of the dataset partitions.\n \"\"\"\n sns.set()\n sns.set_style(\"whitegrid\", {\"axes.grid\": False})\n sns.set_context(\"paper\", font_scale=1.5)\n sns.set_palette(\"Set2\")\n\n num_clients = len(partitions_map)\n num_classes = self.num_classes\n\n plt.figure(figsize=(12, 8))\n\n label_distribution = [[] for _ in range(num_classes)]\n for c_id, idc in partitions_map.items():\n for idx in idc:\n label_distribution[dataset.targets[idx]].append(c_id)\n\n plt.hist(\n label_distribution,\n stacked=True,\n bins=np.arange(-0.5, num_clients + 1.5, 1),\n label=dataset.classes,\n rwidth=0.5,\n )\n plt.xticks(\n np.arange(num_clients),\n [\"Participant %d\" % (c_id + 1) for c_id in range(num_clients)],\n )\n plt.title(\"Distribution of splited datasets\")\n plt.xlabel(\"Participant\")\n plt.ylabel(\"Number of samples\")\n plt.xticks(range(num_clients), [f\" {i}\" for i in range(num_clients)])\n plt.legend(loc=\"upper right\")\n plt.tight_layout()\n\n path_to_save = 
f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/all_data_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n
balanced_iid_partition(dataset)
Partition the dataset into balanced and IID (Independent and Identically Distributed) subsets for each client.
This function divides a dataset into a specified number of subsets (federated clients), where each subset has an equal class distribution. This makes the partition suitable for simulating IID data scenarios in federated learning.
The dataset to partition. It must expose a targets attribute holding the integer class labels (a NumPy array, a tensor with a .numpy() method, or a list).
A dictionary where keys are client IDs (ranging from 0 to partitions_number-1) and values are lists of indices corresponding to the samples assigned to each client.
Each class contributes the same number of samples to every subset. For each class, the function shuffles that class's indices and gives each client min_count // partitions_number of them, where min_count is the size of the smallest class; any remaining samples are left unassigned. The function does not log the class distribution of the subsets.
federated_data = balanced_iid_partition(my_dataset)
def balanced_iid_partition(self, dataset):\n \"\"\"\n Partition the dataset into balanced and IID (Independent and Identically Distributed)\n subsets for each client.\n\n This function divides a dataset into a specified number of subsets (federated clients),\n where each subset has an equal class distribution. This makes the partition suitable for\n simulating IID data scenarios in federated learning.\n\n Args:\n dataset (list): The dataset to partition. It should be a list of tuples where each\n tuple represents a data sample and its corresponding label.\n\n Returns:\n dict: A dictionary where keys are client IDs (ranging from 0 to partitions_number-1) and\n values are lists of indices corresponding to the samples assigned to each client.\n\n The function ensures that each class is represented equally in each subset. The\n partitioning process involves iterating over each class, shuffling the indices of that class,\n and then splitting them equally among the clients. The function does not print the class\n distribution in each subset.\n\n Example usage:\n federated_data = balanced_iid_partition(my_dataset)\n # This creates federated data subsets with equal class distributions.\n \"\"\"\n num_clients = self.partitions_number\n clients_data = {i: [] for i in range(num_clients)}\n\n # Get the labels from the dataset\n if isinstance(dataset.targets, np.ndarray):\n labels = dataset.targets\n elif hasattr(dataset.targets, \"numpy\"): # Check if it's a tensor with .numpy() method\n labels = dataset.targets.numpy()\n else: # If it's a list\n labels = np.asarray(dataset.targets)\n\n label_counts = np.bincount(labels)\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n for label in range(self.num_classes):\n # Get the indices of the same label samples\n label_indices = np.where(labels == label)[0]\n np.random.seed(self.seed)\n np.random.shuffle(label_indices)\n\n # Split the data based on their labels\n samples_per_client = min_count // 
num_clients\n\n for i in range(num_clients):\n start_idx = i * samples_per_client\n end_idx = (i + 1) * samples_per_client\n clients_data[i].extend(label_indices[start_idx:end_idx])\n\n return clients_data\n
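The balanced split can be illustrated with a small standalone sketch. It is not the NEBULA implementation: it uses only the standard library instead of NumPy, and the function name balanced_iid_split and its arguments are hypothetical. It also iterates only over labels that actually occur, whereas the method above loops over range(self.num_classes).

```python
import random
from collections import Counter


def balanced_iid_split(labels, num_clients, seed=0):
    """Give every client the same number of samples of every class (illustrative helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    # The rarest class bounds how many samples of each class a client can receive.
    per_client = min(counts.values()) // num_clients
    clients = {i: [] for i in range(num_clients)}
    for label in sorted(counts):
        idxs = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idxs)
        for c in range(num_clients):
            clients[c].extend(idxs[c * per_client:(c + 1) * per_client])
    return clients


labels = [0] * 10 + [1] * 6 + [2] * 8
parts = balanced_iid_split(labels, num_clients=3)
for client, idxs in parts.items():
    # Each client ends up with 2 samples of each of the 3 classes.
    print(client, sorted(Counter(labels[i] for i in idxs).items()))
```

Because the per-class quota is derived from the smallest class, samples of larger classes beyond that quota are never assigned to any client.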
generate_iid_map(dataset, plot=False)
abstractmethod
Create an IID map of the dataset.
@abstractmethod\ndef generate_iid_map(self, dataset, plot=False):\n \"\"\"\n Create an iid map of the dataset.\n \"\"\"\n pass\n
generate_non_iid_map(dataset, partition='dirichlet', plot=False)
Create a non-IID map of the dataset.
@abstractmethod\ndef generate_non_iid_map(self, dataset, partition=\"dirichlet\", plot=False):\n \"\"\"\n Create a non-iid map of the dataset.\n \"\"\"\n pass\n
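The default partition value is "dirichlet". As a rough illustration of what a Dirichlet label split does, the sketch below draws per-class proportions from a symmetric Dirichlet distribution and cuts each class's indices accordingly. It is a simplified stand-in, not the class's dirichlet_partition method: the helper names are hypothetical, it uses the standard library (normalized Gamma draws) rather than np.random.dirichlet, and it omits the min_samples_per_class filter and the proportion adjustment.

```python
import random


def dirichlet(alpha, k, rng):
    """Sample a k-dimensional symmetric Dirichlet vector via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_split(labels, num_clients, alpha=0.5, seed=0):
    """Split sample indices across clients, class by class (illustrative helper)."""
    rng = random.Random(seed)
    clients = {i: [] for i in range(num_clients)}
    for label in sorted(set(labels)):
        idxs = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idxs)
        props = dirichlet(alpha, num_clients, rng)
        # Turn cumulative proportions into cut points over this class's indices.
        cuts, acc = [], 0.0
        for p in props[:-1]:
            acc += p
            cuts.append(int(acc * len(idxs)))
        bounds = [0] + cuts + [len(idxs)]
        for c in range(num_clients):
            clients[c].extend(idxs[bounds[c]:bounds[c + 1]])
    return clients


labels = [0] * 30 + [1] * 30
parts = dirichlet_split(labels, num_clients=4, alpha=0.5, seed=1)
print({c: len(v) for c, v in parts.items()})
```

Smaller alpha values concentrate each class on fewer clients, while larger values push the split toward uniform.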
get_local_test_labels()
Get the labels of the local test set based on the indices map.
def get_local_test_labels(self):\n \"\"\"\n Get the labels of the local test set based on the indices map.\n \"\"\"\n if self.local_test_indices_map is None:\n return None\n return [self.test_set.targets[idx] for idx in self.local_test_indices_map]\n
get_test_labels()
Get the labels of the test set based on the indices map.
def get_test_labels(self):\n \"\"\"\n Get the labels of the test set based on the indices map.\n \"\"\"\n if self.test_indices_map is None:\n return None\n return [self.test_set.targets[idx] for idx in self.test_indices_map]\n
get_train_labels()
Get the labels of the training set based on the indices map.
def get_train_labels(self):\n \"\"\"\n Get the labels of the training set based on the indices map.\n \"\"\"\n if self.train_indices_map is None:\n return None\n return [self.train_set.targets[idx] for idx in self.train_indices_map]\n
homo_partition(dataset)
Homogeneously partition the dataset into multiple subsets.
This function divides a dataset into a specified number of subsets, where each subset is intended to have a roughly equal number of samples. This method aims to ensure a homogeneous distribution of data across all subsets. It's particularly useful in scenarios where a uniform distribution of data is desired among all federated learning clients.
The dataset to partition. It should have 'data' and 'targets' attributes.
A dictionary where keys are subset indices (ranging from 0 to partitions_number-1) and values are lists of indices corresponding to the samples in each subset.
The function randomly shuffles the indices of the entire dataset and then splits them into the number of subsets specified by partitions_number, so subset sizes differ by at most one sample. The function also logs the class distribution of each subset for reference.
federated_data = homo_partition(my_dataset)
def homo_partition(self, dataset):\n \"\"\"\n Homogeneously partition the dataset into multiple subsets.\n\n This function divides a dataset into a specified number of subsets, where each subset\n is intended to have a roughly equal number of samples. This method aims to ensure a\n homogeneous distribution of data across all subsets. It's particularly useful in\n scenarios where a uniform distribution of data is desired among all federated learning\n clients.\n\n Args:\n dataset (torch.utils.data.Dataset): The dataset to partition. It should have\n 'data' and 'targets' attributes.\n\n Returns:\n dict: A dictionary where keys are subset indices (ranging from 0 to partitions_number-1)\n and values are lists of indices corresponding to the samples in each subset.\n\n The function randomly shuffles the entire dataset and then splits it into the number\n of subsets specified by `partitions_number`. It ensures that each subset has a similar number\n of samples. The function also prints the class distribution in each subset for reference.\n\n Example usage:\n federated_data = homo_partition(my_dataset)\n # This creates federated data subsets with homogeneous distribution.\n \"\"\"\n n_nets = self.partitions_number\n\n n_train = len(dataset.targets)\n np.random.seed(self.seed)\n idxs = np.random.permutation(n_train)\n batch_idxs = np.array_split(idxs, n_nets)\n net_dataidx_map = {i: batch_idxs[i] for i in range(n_nets)}\n\n # partitioned_datasets = []\n for i in range(self.partitions_number):\n # subset = torch.utils.data.Subset(dataset, net_dataidx_map[i])\n # partitioned_datasets.append(subset)\n\n # Print class distribution in the current partition\n class_counts = [0] * self.num_classes\n for idx in net_dataidx_map[i]:\n label = dataset.targets[idx]\n class_counts[label] += 1\n logging_training.info(f\"Partition {i + 1} class distribution: {class_counts}\")\n\n return net_dataidx_map\n
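The shuffle-then-split step can be sketched without NumPy as follows. The helper name homo_split is hypothetical; the size rule mirrors what numpy.array_split does, where the first num_samples % num_parts chunks receive one extra element.

```python
import random


def homo_split(num_samples, num_parts, seed=0):
    """Shuffle all indices and cut them into near-equal chunks (illustrative helper)."""
    rng = random.Random(seed)
    idxs = list(range(num_samples))
    rng.shuffle(idxs)
    # Same sizing rule as numpy.array_split: leading chunks absorb the remainder.
    base, extra = divmod(num_samples, num_parts)
    parts, start = {}, 0
    for p in range(num_parts):
        size = base + (1 if p < extra else 0)
        parts[p] = idxs[start:start + size]
        start += size
    return parts


parts = homo_split(10, 3)
print([len(v) for v in parts.values()])  # [4, 3, 3]
```

Since labels play no role in the split, the class mix of each subset only approximates the global distribution; that is why the method logs per-partition class counts.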
initialize_dataset()
Initialize the dataset. This should load or create the dataset.
@abstractmethod\ndef initialize_dataset(self):\n \"\"\"\n Initialize the dataset. This should load or create the dataset.\n \"\"\"\n pass\n
percentage_partition(dataset, percentage=20)
Partition a dataset into multiple subsets with a specified level of non-IID-ness.
This function divides a dataset into a specified number of subsets (federated clients), where each subset has a different class distribution. The class distribution in each subset is determined by a specified percentage, making the partition suitable for simulating non-IID (non-Independently and Identically Distributed) data scenarios in federated learning.
percentage
A value between 0 and 100 that specifies the desired level of non-IID-ness for the labels of the federated data. Each subset receives int(num_classes * percentage / 100) classes, so a lower percentage gives each subset fewer classes; a ValueError is raised if this number is below 1.
20
The number of classes in each subset is set by the selected percentage. The class list is shuffled once, and each subset is then assigned its classes by cycling through that list; for every assigned class, the subset takes the first min_count // 2 shuffled indices of that class, where min_count is the size of the smallest class. The function also logs the class distribution of each subset for reference.
federated_data = percentage_partition(my_dataset, percentage=20)
def percentage_partition(self, dataset, percentage=20):\n \"\"\"\n Partition a dataset into multiple subsets with a specified level of non-IID-ness.\n\n This function divides a dataset into a specified number of subsets (federated\n clients), where each subset has a different class distribution. The class\n distribution in each subset is determined by a specified percentage, making the\n partition suitable for simulating non-IID (non-Independently and Identically\n Distributed) data scenarios in federated learning.\n\n Args:\n dataset (torch.utils.data.Dataset): The dataset to partition. It should have\n 'data' and 'targets' attributes.\n percentage (int): A value between 0 and 100 that specifies the desired\n level of non-IID-ness for the labels of the federated data.\n This percentage controls the imbalance in the class distribution\n across different subsets.\n\n Returns:\n dict: A dictionary where keys are subset indices (ranging from 0 to partitions_number-1)\n and values are lists of indices corresponding to the samples in each subset.\n\n The function ensures that the number of classes in each subset varies based on the selected\n percentage. 
The partitioning process involves iterating over each class, shuffling the\n indices of that class, and then splitting them according to the calculated subset sizes.\n The function also prints the class distribution in each subset for reference.\n\n Example usage:\n federated_data = percentage_partition(my_dataset, percentage=20)\n # This creates federated data subsets with varying class distributions based on\n # a percentage of 20.\n \"\"\"\n if isinstance(dataset.targets, np.ndarray):\n y_train = dataset.targets\n elif hasattr(dataset.targets, \"numpy\"): # Check if it's a tensor with .numpy() method\n y_train = dataset.targets.numpy()\n else: # If it's a list\n y_train = np.asarray(dataset.targets)\n\n num_classes = self.num_classes\n num_subsets = self.partitions_number\n class_indices = {i: np.where(y_train == i)[0] for i in range(num_classes)}\n\n # Get the labels from the dataset\n labels = np.array([dataset.targets[idx] for idx in range(len(dataset))])\n label_counts = np.bincount(labels)\n\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n classes_per_subset = int(num_classes * percentage / 100)\n if classes_per_subset < 1:\n raise ValueError(\"The percentage is too low to assign at least one class to each subset.\")\n\n subset_indices = [[] for _ in range(num_subsets)]\n class_list = list(range(num_classes))\n np.random.seed(self.seed)\n np.random.shuffle(class_list)\n\n for i in range(num_subsets):\n for j in range(classes_per_subset):\n # Use modulo operation to cycle through the class_list\n class_idx = class_list[(i * classes_per_subset + j) % num_classes]\n indices = class_indices[class_idx]\n np.random.seed(self.seed)\n np.random.shuffle(indices)\n # Select approximately 50% of the indices\n subset_indices[i].extend(indices[: min_count // 2])\n\n class_counts = np.bincount(np.array([dataset.targets[idx] for idx in subset_indices[i]]))\n logging_training.info(f\"Partition {i + 1} class distribution: 
{class_counts.tolist()}\")\n\n partitioned_datasets = {i: subset_indices[i] for i in range(num_subsets)}\n\n return partitioned_datasets\n
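The class-assignment rule is the part of this method that is easiest to misread, so here is a minimal sketch of it in isolation. The helper name classes_per_client is hypothetical, and the sketch uses the standard library's random module instead of NumPy, so the shuffled order will not match what the method produces for the same seed.

```python
import random


def classes_per_client(num_classes, num_clients, percentage, seed=0):
    """Return the classes assigned to each client (illustrative helper)."""
    per_subset = int(num_classes * percentage / 100)
    if per_subset < 1:
        raise ValueError("percentage too low to assign at least one class")
    rng = random.Random(seed)
    class_list = list(range(num_classes))
    rng.shuffle(class_list)
    # Walk the shuffled list cyclically: client c takes the c-th slice of length per_subset.
    return {
        c: [class_list[(c * per_subset + j) % num_classes] for j in range(per_subset)]
        for c in range(num_clients)
    }


assignment = classes_per_client(num_classes=10, num_clients=5, percentage=20)
for client, classes in assignment.items():
    print(client, classes)
```

With 10 classes, 5 clients, and a percentage of 20, every client gets 2 classes and no class is shared. With more clients than num_classes / per_subset, the modulo wraps around and clients start to share classes.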
plot_all_data_distribution(dataset, partitions_map)
Plot the distribution of the whole dataset across participants, according to the partitions map provided.
The dataset to plot (torch.utils.data.Dataset).
partitions_map
The map of the dataset partitions.
def plot_all_data_distribution(self, dataset, partitions_map):\n \"\"\"\n\n Plot all of the data distribution of the dataset according to the partitions map provided.\n\n Args:\n dataset: The dataset to plot (torch.utils.data.Dataset).\n partitions_map: The map of the dataset partitions.\n \"\"\"\n sns.set()\n sns.set_style(\"whitegrid\", {\"axes.grid\": False})\n sns.set_context(\"paper\", font_scale=1.5)\n sns.set_palette(\"Set2\")\n\n num_clients = len(partitions_map)\n num_classes = self.num_classes\n\n plt.figure(figsize=(12, 8))\n\n label_distribution = [[] for _ in range(num_classes)]\n for c_id, idc in partitions_map.items():\n for idx in idc:\n label_distribution[dataset.targets[idx]].append(c_id)\n\n plt.hist(\n label_distribution,\n stacked=True,\n bins=np.arange(-0.5, num_clients + 1.5, 1),\n label=dataset.classes,\n rwidth=0.5,\n )\n plt.xticks(\n np.arange(num_clients),\n [\"Participant %d\" % (c_id + 1) for c_id in range(num_clients)],\n )\n plt.title(\"Distribution of splited datasets\")\n plt.xlabel(\"Participant\")\n plt.ylabel(\"Number of samples\")\n plt.xticks(range(num_clients), [f\" {i}\" for i in range(num_clients)])\n plt.legend(loc=\"upper right\")\n plt.tight_layout()\n\n path_to_save = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/all_data_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n
plot_data_distribution(dataset, partitions_map)
Plot the data distribution of the dataset.
Plot the data distribution of the dataset according to the partitions map provided.
def plot_data_distribution(self, dataset, partitions_map):\n \"\"\"\n Plot the data distribution of the dataset.\n\n Plot the data distribution of the dataset according to the partitions map provided.\n\n Args:\n dataset: The dataset to plot (torch.utils.data.Dataset).\n partitions_map: The map of the dataset partitions.\n \"\"\"\n # Plot the data distribution of the dataset, one graph per partition\n sns.set()\n sns.set_style(\"whitegrid\", {\"axes.grid\": False})\n sns.set_context(\"paper\", font_scale=1.5)\n sns.set_palette(\"Set2\")\n\n for i in range(self.partitions_number):\n indices = partitions_map[i]\n class_counts = [0] * self.num_classes\n for idx in indices:\n label = dataset.targets[idx]\n class_counts[label] += 1\n logging_training.info(f\"Participant {i + 1} class distribution: {class_counts}\")\n plt.figure()\n plt.bar(range(self.num_classes), class_counts)\n plt.xlabel(\"Class\")\n plt.ylabel(\"Number of samples\")\n plt.xticks(range(self.num_classes))\n if self.iid:\n plt.title(f\"Participant {i + 1} class distribution (IID)\")\n else:\n plt.title(\n f\"Participant {i + 1} class distribution (Non-IID - {self.partition}) - {self.partition_parameter}\"\n )\n plt.tight_layout()\n path_to_save = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/participant_{i}_class_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n\n plt.figure()\n max_point_size = 500\n min_point_size = 0\n\n for i in range(self.partitions_number):\n class_counts = [0] * self.num_classes\n indices = partitions_map[i]\n for idx in indices:\n label = dataset.targets[idx]\n class_counts[label] += 1\n\n # Normalize the point sizes for this partition\n max_samples_partition = max(class_counts)\n sizes = [\n (size / max_samples_partition) * (max_point_size - min_point_size) + min_point_size\n for size in 
class_counts\n ]\n plt.scatter([i] * self.num_classes, range(self.num_classes), s=sizes, alpha=0.5)\n\n plt.xlabel(\"Participant\")\n plt.ylabel(\"Class\")\n plt.xticks(range(self.partitions_number))\n plt.yticks(range(self.num_classes))\n if self.iid:\n plt.title(f\"Participant {i + 1} class distribution (IID)\")\n else:\n plt.title(\n f\"Participant {i + 1} class distribution (Non-IID - {self.partition}) - {self.partition_parameter}\"\n )\n plt.tight_layout()\n\n # Saves the distribution display with circles of different size\n path_to_save = f\"{self.config.participant['tracking_args']['log_dir']}/{self.config.participant['scenario_args']['name']}/class_distribution_{'iid' if self.iid else 'non_iid'}{'_' + self.partition if not self.iid else ''}.png\"\n plt.savefig(path_to_save, dpi=300, bbox_inches=\"tight\")\n plt.close()\n\n if hasattr(self, \"tsne\") and self.tsne:\n self.visualize_tsne(dataset)\n
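Both plots in this method are driven by per-partition class counts and, for the scatter plot, by point sizes normalized to the largest class in each partition. The sketch below reproduces only that bookkeeping, without any plotting. The helper names are hypothetical, and the zero-count guard is an addition for illustration: the method shown above divides by the maximum count directly.

```python
from collections import Counter


def class_counts(targets, partitions_map, num_classes):
    """Count samples per class for every partition (illustrative helper)."""
    table = {}
    for pid, idxs in partitions_map.items():
        counts = Counter(targets[i] for i in idxs)
        table[pid] = [counts.get(c, 0) for c in range(num_classes)]
    return table


def point_sizes(counts, max_size=500, min_size=0):
    """Scale counts to marker sizes relative to the partition's largest class."""
    peak = max(counts)
    if peak == 0:
        # Guard added in this sketch: an empty partition would otherwise divide by zero.
        return [min_size] * len(counts)
    return [(c / peak) * (max_size - min_size) + min_size for c in counts]


targets = [0, 0, 1, 2, 2, 2]
partitions_map = {0: [0, 1, 2], 1: [3, 4, 5]}
table = class_counts(targets, partitions_map, num_classes=3)
print(table)                  # {0: [2, 1, 0], 1: [0, 0, 3]}
print(point_sizes(table[0]))  # [500.0, 250.0, 0.0]
```

Because sizes are normalized per partition, marker sizes are comparable within a participant's column but not across participants.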
unbalanced_iid_partition(dataset, imbalance_factor=2)
Partition the dataset into multiple IID (Independent and Identically Distributed) subsets of different sizes.
This function divides a dataset into a specified number of IID subsets (federated clients), where each subset has a different number of samples. The number of samples in each subset is determined by an imbalance factor, making the partition suitable for simulating imbalanced data scenarios in federated learning.
imbalance_factor
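The factor that determines the degree of imbalance among the subsets. Each client receives (imbalance_factor - 1) / imbalance_factor times as many samples per class as the previous client (rounded down), so a lower imbalance factor leads to more imbalanced partitions.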
2
Within a subset, every class contributes the same number of samples; it is the subset sizes that vary from client to client. The first client receives min_count // partitions_number samples per class, where min_count is the size of the smallest class, and each later client receives a reduced share. For each class, the function shuffles that class's indices and then slices them according to the calculated subset sizes. The function does not log the class distribution of the subsets.
federated_data = unbalanced_iid_partition(my_dataset, imbalance_factor=2)
def unbalanced_iid_partition(self, dataset, imbalance_factor=2):\n \"\"\"\n Partition the dataset into multiple IID (Independent and Identically Distributed)\n subsets with different size.\n\n This function divides a dataset into a specified number of IID subsets (federated\n clients), where each subset has a different number of samples. The number of samples\n in each subset is determined by an imbalance factor, making the partition suitable\n for simulating imbalanced data scenarios in federated learning.\n\n Args:\n dataset (list): The dataset to partition. It should be a list of tuples where\n each tuple represents a data sample and its corresponding label.\n imbalance_factor (float): The factor to determine the degree of imbalance\n among the subsets. A lower imbalance factor leads to more\n imbalanced partitions.\n\n Returns:\n dict: A dictionary where keys are client IDs (ranging from 0 to partitions_number-1) and\n values are lists of indices corresponding to the samples assigned to each client.\n\n The function ensures that each class is represented in each subset but with varying\n proportions. The partitioning process involves iterating over each class, shuffling\n the indices of that class, and then splitting them according to the calculated subset\n sizes. 
The function does not print the class distribution in each subset.\n\n Example usage:\n federated_data = unbalanced_iid_partition(my_dataset, imbalance_factor=2)\n # This creates federated data subsets with varying number of samples based on\n # an imbalance factor of 2.\n \"\"\"\n num_clients = self.partitions_number\n clients_data = {i: [] for i in range(num_clients)}\n\n # Get the labels from the dataset\n labels = np.array([dataset.targets[idx] for idx in range(len(dataset))])\n label_counts = np.bincount(labels)\n\n min_label = label_counts.argmin()\n min_count = label_counts[min_label]\n\n # Set the initial_subset_size\n initial_subset_size = min_count // num_clients\n\n # Calculate the number of samples for each subset based on the imbalance factor\n subset_sizes = [initial_subset_size]\n for i in range(1, num_clients):\n subset_sizes.append(int(subset_sizes[i - 1] * ((imbalance_factor - 1) / imbalance_factor)))\n\n for label in range(self.num_classes):\n # Get the indices of the same label samples\n label_indices = np.where(labels == label)[0]\n np.random.seed(self.seed)\n np.random.shuffle(label_indices)\n\n # Split the data based on their labels\n start = 0\n for i in range(num_clients):\n end = start + subset_sizes[i]\n clients_data[i].extend(label_indices[start:end])\n start = end\n\n return clients_data\n
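The effect of imbalance_factor is easiest to see by computing the per-class subset sizes on their own. The sketch below applies the same recurrence as the method above; the standalone function name subset_sizes and the sample numbers are chosen for illustration.

```python
def subset_sizes(min_count, num_clients, imbalance_factor=2):
    """Per-class sample count for each client, shrinking geometrically."""
    sizes = [min_count // num_clients]
    for _ in range(1, num_clients):
        # Each client gets (f - 1) / f of the previous client's share, rounded down.
        sizes.append(int(sizes[-1] * ((imbalance_factor - 1) / imbalance_factor)))
    return sizes


print(subset_sizes(1000, 4, imbalance_factor=2))   # [250, 125, 62, 31]
print(subset_sizes(1000, 4, imbalance_factor=10))  # [250, 225, 202, 181]
```

A factor of 2 halves the share at every step, while a factor of 10 only trims it by a tenth. A factor of 1 would make the ratio zero, leaving every client after the first with no samples.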
NebulaModel
Bases: LightningModule, ABC
Abstract class for the NEBULA model.
It defines the interface that concrete NEBULA models implement: subclasses must provide forward and configure_optimizers.
nebula/core/models/nebulamodel.py
class NebulaModel(pl.LightningModule, ABC):\n \"\"\"\n Abstract class for the NEBULA model.\n\n This class is an abstract class that defines the interface for the NEBULA model.\n \"\"\"\n\n def process_metrics(self, phase, y_pred, y, loss=None):\n \"\"\"\n Calculate and log metrics for the given phase.\n The metrics are calculated in each batch.\n Args:\n phase (str): One of 'Train', 'Validation', or 'Test'\n y_pred (torch.Tensor): Model predictions\n y (torch.Tensor): Ground truth labels\n loss (torch.Tensor, optional): Loss value\n \"\"\"\n\n y_pred_classes = torch.argmax(y_pred, dim=1).detach()\n y = y.detach()\n if phase == \"Train\":\n self.logger.log_data({f\"{phase}/Loss\": loss.detach()})\n self.train_metrics.update(y_pred_classes, y)\n elif phase == \"Validation\":\n self.val_metrics.update(y_pred_classes, y)\n elif phase == \"Test (Local)\":\n self.test_metrics.update(y_pred_classes, y)\n self.cm.update(y_pred_classes, y) if self.cm is not None else None\n elif phase == \"Test (Global)\":\n self.test_metrics_global.update(y_pred_classes, y)\n self.cm_global.update(y_pred_classes, y) if self.cm_global is not None else None\n else:\n raise NotImplementedError\n\n del y_pred_classes, y\n\n def log_metrics_end(self, phase):\n \"\"\"\n Log metrics for the given phase.\n Args:\n phase (str): One of 'Train', 'Validation', 'Test (Local)', or 'Test (Global)'\n print_cm (bool): Print confusion matrix\n plot_cm (bool): Plot confusion matrix\n \"\"\"\n if phase == \"Train\":\n output = self.train_metrics.compute()\n elif phase == \"Validation\":\n output = self.val_metrics.compute()\n elif phase == \"Test (Local)\":\n output = self.test_metrics.compute()\n elif phase == \"Test (Global)\":\n output = self.test_metrics_global.compute()\n else:\n raise NotImplementedError\n\n output = {\n f\"{phase}/{key.replace('Multiclass', '').split('/')[-1]}\": value.detach() for key, value in output.items()\n }\n\n self.logger.log_data(output, step=self.global_number[phase])\n\n 
metrics_str = \"\"\n for key, value in output.items():\n metrics_str += f\"{key}: {value:.4f}\\n\"\n print_msg_box(\n metrics_str,\n indent=2,\n title=f\"{phase} Metrics | Epoch: {self.global_number[phase]} | Round: {self.round}\",\n logger_name=TRAINING_LOGGER,\n )\n\n def generate_confusion_matrix(self, phase, print_cm=False, plot_cm=False):\n \"\"\"\n Compute, optionally log and plot, and then reset the confusion matrix for the given phase.\n Args:\n phase (str): 'Test (Local)' or 'Test (Global)'\n print_cm (bool): Log the confusion matrix as text\n plot_cm (bool): Plot the confusion matrix as a heatmap and log the figure\n \"\"\"\n if phase == \"Test (Local)\":\n if self.cm is None:\n raise ValueError(f\"Confusion matrix not available for {phase} phase.\")\n cm = self.cm.compute().cpu()\n elif phase == \"Test (Global)\":\n if self.cm_global is None:\n raise ValueError(f\"Confusion matrix not available for {phase} phase.\")\n cm = self.cm_global.compute().cpu()\n else:\n raise NotImplementedError\n\n if print_cm:\n logging_training.info(f\"{phase} / Confusion Matrix:\\n{cm}\")\n\n if plot_cm:\n cm_numpy = cm.numpy().astype(int)\n classes = [i for i in range(self.num_classes)]\n fig, ax = plt.subplots(figsize=(12, 12))\n sns.heatmap(\n cm_numpy,\n annot=False,\n fmt=\"\",\n cmap=\"Blues\",\n ax=ax,\n xticklabels=classes,\n yticklabels=classes,\n square=True,\n )\n ax.set_xlabel(\"Predicted labels\", fontsize=12)\n ax.set_ylabel(\"True labels\", fontsize=12)\n ax.set_title(f\"{phase} Confusion Matrix\", fontsize=16)\n plt.xticks(rotation=90, fontsize=6)\n plt.yticks(rotation=0, fontsize=6)\n plt.tight_layout()\n self.logger.log_figure(fig, step=self.round, name=f\"{phase}/CM\")\n plt.close()\n\n del cm_numpy, classes, fig, ax\n\n # Reset the confusion matrix\n if phase == \"Test (Local)\":\n self.cm.reset()\n else:\n self.cm_global.reset()\n\n del cm\n\n def __init__(\n self,\n input_channels=1,\n num_classes=10,\n learning_rate=1e-3,\n metrics=None,\n confusion_matrix=None,\n seed=None,\n ):\n super().__init__()\n\n self.input_channels = input_channels\n self.num_classes = 
num_classes\n self.learning_rate = learning_rate\n\n if metrics is None:\n metrics = MetricCollection([\n MulticlassAccuracy(num_classes=num_classes),\n MulticlassPrecision(num_classes=num_classes),\n MulticlassRecall(num_classes=num_classes),\n MulticlassF1Score(num_classes=num_classes),\n ])\n self.train_metrics = metrics.clone(prefix=\"Train/\")\n self.val_metrics = metrics.clone(prefix=\"Validation/\")\n self.test_metrics = metrics.clone(prefix=\"Test (Local)/\")\n self.test_metrics_global = metrics.clone(prefix=\"Test (Global)/\")\n del metrics\n if confusion_matrix is None:\n self.cm = MulticlassConfusionMatrix(num_classes=num_classes)\n self.cm_global = MulticlassConfusionMatrix(num_classes=num_classes)\n if seed is not None:\n torch.manual_seed(seed)\n torch.cuda.manual_seed_all(seed)\n\n # Round counter (number of training-validation-test rounds)\n self.round = 0\n\n # Epochs counter\n self.global_number = {\n \"Train\": 0,\n \"Validation\": 0,\n \"Test (Local)\": 0,\n \"Test (Global)\": 0,\n }\n\n # Communication manager for sending messages from the model (e.g., prototypes, gradients)\n # Model parameters are sent by default using network.propagator\n self.communication_manager = None\n\n def set_communication_manager(self, communication_manager):\n self.communication_manager = communication_manager\n\n def get_communication_manager(self):\n if self.communication_manager is None:\n raise ValueError(\"Communication manager not set.\")\n return self.communication_manager\n\n @abstractmethod\n def forward(self, x):\n \"\"\"Forward pass of the model.\"\"\"\n pass\n\n @abstractmethod\n def configure_optimizers(self):\n \"\"\"Optimizer configuration.\"\"\"\n pass\n\n def step(self, batch, batch_idx, phase):\n \"\"\"Training/validation/test step.\"\"\"\n x, y = batch\n y_pred = self.forward(x)\n loss = self.criterion(y_pred, y)\n self.process_metrics(phase, y_pred, y, loss)\n\n return loss\n\n def training_step(self, batch, batch_idx):\n \"\"\"\n Training step 
for the model.\n Args:\n batch:\n batch_id:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx=batch_idx, phase=\"Train\")\n\n def on_train_start(self):\n logging_training.info(f\"{'=' * 10} [Training] Started {'=' * 10}\")\n\n def on_train_end(self):\n logging_training.info(f\"{'=' * 10} [Training] Done {'=' * 10}\")\n\n def on_train_epoch_end(self):\n self.log_metrics_end(\"Train\")\n self.train_metrics.reset()\n self.global_number[\"Train\"] += 1\n\n def validation_step(self, batch, batch_idx):\n \"\"\"\n Validation step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx=batch_idx, phase=\"Validation\")\n\n def on_validation_end(self):\n pass\n\n def on_validation_epoch_end(self):\n # In general, the validation phase is done in one epoch\n self.log_metrics_end(\"Validation\")\n self.val_metrics.reset()\n self.global_number[\"Validation\"] += 1\n\n def test_step(self, batch, batch_idx, dataloader_idx=None):\n \"\"\"\n Test step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n if dataloader_idx == 0:\n return self.step(batch, batch_idx=batch_idx, phase=\"Test (Local)\")\n else:\n return self.step(batch, batch_idx=batch_idx, phase=\"Test (Global)\")\n\n def on_test_start(self):\n logging_training.info(f\"{'=' * 10} [Testing] Started {'=' * 10}\")\n\n def on_test_end(self):\n logging_training.info(f\"{'=' * 10} [Testing] Done {'=' * 10}\")\n\n def on_test_epoch_end(self):\n # In general, the test phase is done in one epoch\n self.log_metrics_end(\"Test (Local)\")\n self.log_metrics_end(\"Test (Global)\")\n self.generate_confusion_matrix(\"Test (Local)\", print_cm=True, plot_cm=True)\n self.generate_confusion_matrix(\"Test (Global)\", print_cm=True, plot_cm=True)\n self.test_metrics.reset()\n self.test_metrics_global.reset()\n self.global_number[\"Test (Local)\"] += 1\n self.global_number[\"Test (Global)\"] += 1\n\n def on_round_end(self):\n self.round += 1\n
configure_optimizers()
Optimizer configuration.
@abstractmethod\ndef configure_optimizers(self):\n \"\"\"Optimizer configuration.\"\"\"\n pass\n
forward(x)
Forward pass of the model.
@abstractmethod\ndef forward(self, x):\n \"\"\"Forward pass of the model.\"\"\"\n pass\n
generate_confusion_matrix(phase, print_cm=False, plot_cm=False)
Compute the confusion matrix for the given phase, optionally log it as text and as a heatmap figure, then reset it. Args: phase (str): 'Test (Local)' or 'Test (Global)'; any other value raises NotImplementedError print_cm (bool): Log the confusion matrix as text plot_cm (bool): Plot the confusion matrix as a heatmap and log the figure
def generate_confusion_matrix(self, phase, print_cm=False, plot_cm=False):\n \"\"\"\n Compute, optionally log and plot, and then reset the confusion matrix for the given phase.\n Args:\n phase (str): 'Test (Local)' or 'Test (Global)'\n print_cm (bool): Log the confusion matrix as text\n plot_cm (bool): Plot the confusion matrix as a heatmap and log the figure\n \"\"\"\n if phase == \"Test (Local)\":\n if self.cm is None:\n raise ValueError(f\"Confusion matrix not available for {phase} phase.\")\n cm = self.cm.compute().cpu()\n elif phase == \"Test (Global)\":\n if self.cm_global is None:\n raise ValueError(f\"Confusion matrix not available for {phase} phase.\")\n cm = self.cm_global.compute().cpu()\n else:\n raise NotImplementedError\n\n if print_cm:\n logging_training.info(f\"{phase} / Confusion Matrix:\\n{cm}\")\n\n if plot_cm:\n cm_numpy = cm.numpy().astype(int)\n classes = [i for i in range(self.num_classes)]\n fig, ax = plt.subplots(figsize=(12, 12))\n sns.heatmap(\n cm_numpy,\n annot=False,\n fmt=\"\",\n cmap=\"Blues\",\n ax=ax,\n xticklabels=classes,\n yticklabels=classes,\n square=True,\n )\n ax.set_xlabel(\"Predicted labels\", fontsize=12)\n ax.set_ylabel(\"True labels\", fontsize=12)\n ax.set_title(f\"{phase} Confusion Matrix\", fontsize=16)\n plt.xticks(rotation=90, fontsize=6)\n plt.yticks(rotation=0, fontsize=6)\n plt.tight_layout()\n self.logger.log_figure(fig, step=self.round, name=f\"{phase}/CM\")\n plt.close()\n\n del cm_numpy, classes, fig, ax\n\n # Reset the confusion matrix\n if phase == \"Test (Local)\":\n self.cm.reset()\n else:\n self.cm_global.reset()\n\n del cm\n
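The method above accumulates counts in a torchmetrics confusion matrix and plots it with true labels on the rows and predicted labels on the columns. As a rough illustration of what those counts mean, here is a minimal pure-Python sketch; the helper name `confusion_matrix` and the sample labels are assumptions made for this example, not part of NEBULA.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


# Hypothetical labels for a 3-class problem.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)
```

Off-diagonal entries are misclassifications: `cm[1][2]` counts samples of class 1 that were predicted as class 2.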
log_metrics_end(phase)
Compute the metrics accumulated for the given phase, log them, and print a summary box. Args: phase (str): One of 'Train', 'Validation', 'Test (Local)', or 'Test (Global)'
def log_metrics_end(self, phase):\n \"\"\"\n Compute the metrics accumulated for the given phase, log them, and print a summary box.\n Args:\n phase (str): One of 'Train', 'Validation', 'Test (Local)', or 'Test (Global)'\n \"\"\"\n if phase == \"Train\":\n output = self.train_metrics.compute()\n elif phase == \"Validation\":\n output = self.val_metrics.compute()\n elif phase == \"Test (Local)\":\n output = self.test_metrics.compute()\n elif phase == \"Test (Global)\":\n output = self.test_metrics_global.compute()\n else:\n raise NotImplementedError\n\n output = {\n f\"{phase}/{key.replace('Multiclass', '').split('/')[-1]}\": value.detach() for key, value in output.items()\n }\n\n self.logger.log_data(output, step=self.global_number[phase])\n\n metrics_str = \"\"\n for key, value in output.items():\n metrics_str += f\"{key}: {value:.4f}\\n\"\n print_msg_box(\n metrics_str,\n indent=2,\n title=f\"{phase} Metrics | Epoch: {self.global_number[phase]} | Round: {self.round}\",\n logger_name=TRAINING_LOGGER,\n )\n
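The key rewriting in `log_metrics_end` is easy to misread, so here is a small sketch of it in isolation. It assumes the computed keys have the form prefix plus metric class name (for example `Test (Local)/MulticlassAccuracy`); the helper names `rename_metric_keys` and `format_metrics` and the metric values are hypothetical.

```python
def rename_metric_keys(phase, computed):
    """Drop the 'Multiclass' marker and any existing prefix, then re-prefix with the phase."""
    return {
        f"{phase}/{key.replace('Multiclass', '').split('/')[-1]}": value
        for key, value in computed.items()
    }


def format_metrics(metrics):
    """Build the multi-line summary string, four decimals per metric."""
    return "".join(f"{key}: {value:.4f}\n" for key, value in metrics.items())


# Hypothetical computed metrics, as plain floats instead of tensors.
computed = {
    "Test (Local)/MulticlassAccuracy": 0.91,
    "Test (Local)/MulticlassF1Score": 0.88,
}
renamed = rename_metric_keys("Test (Local)", computed)
summary = format_metrics(renamed)
print(summary)
```

Because only the last `/`-separated segment is kept, the result does not depend on which prefix the metric collection was cloned with.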
process_metrics(phase, y_pred, y, loss=None)
Calculate and log metrics for the given phase. The metrics are calculated in each batch. Args: phase (str): One of 'Train', 'Validation', 'Test (Local)', or 'Test (Global)' y_pred (torch.Tensor): Model predictions y (torch.Tensor): Ground truth labels loss (torch.Tensor, optional): Loss value; required and logged only when phase is 'Train'
def process_metrics(self, phase, y_pred, y, loss=None):\n \"\"\"\n Calculate and log metrics for the given phase.\n The metrics are calculated in each batch.\n Args:\n phase (str): One of 'Train', 'Validation', 'Test (Local)', or 'Test (Global)'\n y_pred (torch.Tensor): Model predictions\n y (torch.Tensor): Ground truth labels\n loss (torch.Tensor, optional): Loss value; required and logged only when phase is 'Train'\n \"\"\"\n\n y_pred_classes = torch.argmax(y_pred, dim=1).detach()\n y = y.detach()\n if phase == \"Train\":\n self.logger.log_data({f\"{phase}/Loss\": loss.detach()})\n self.train_metrics.update(y_pred_classes, y)\n elif phase == \"Validation\":\n self.val_metrics.update(y_pred_classes, y)\n elif phase == \"Test (Local)\":\n self.test_metrics.update(y_pred_classes, y)\n self.cm.update(y_pred_classes, y) if self.cm is not None else None\n elif phase == \"Test (Global)\":\n self.test_metrics_global.update(y_pred_classes, y)\n self.cm_global.update(y_pred_classes, y) if self.cm_global is not None else None\n else:\n raise NotImplementedError\n\n del y_pred_classes, y\n
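`process_metrics` only updates metric state per batch; the values are computed and reset later, at epoch end. The sketch below imitates that update/compute/reset cycle in pure Python. `RunningAccuracy` and `argmax_rows` are hypothetical stand-ins, and the accuracy here is a plain overall ratio, whereas torchmetrics' `MulticlassAccuracy` defaults to a macro average.

```python
def argmax_rows(scores):
    """Index of the largest score in each row (the role of argmax over dim=1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


class RunningAccuracy:
    """Accumulates correct/total counts across batches."""

    def __init__(self):
        self.reset()

    def update(self, preds, targets):
        self.correct += sum(int(p == t) for p, t in zip(preds, targets))
        self.total += len(targets)

    def compute(self):
        return self.correct / self.total if self.total else 0.0

    def reset(self):
        self.correct = 0
        self.total = 0


metric = RunningAccuracy()
# Two hypothetical batches of (per-class scores, ground-truth labels).
batches = [
    ([[0.1, 0.9], [0.8, 0.2]], [1, 1]),
    ([[0.3, 0.7], [0.6, 0.4]], [1, 0]),
]
for scores, targets in batches:
    metric.update(argmax_rows(scores), targets)  # per-batch update

epoch_accuracy = metric.compute()  # epoch-end computation
print(f"Train/Accuracy: {epoch_accuracy:.4f}")
metric.reset()  # start the next epoch from clean state
```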
step(batch, batch_idx, phase)
Training/validation/test step.
def step(self, batch, batch_idx, phase):\n \"\"\"Training/validation/test step.\"\"\"\n x, y = batch\n y_pred = self.forward(x)\n loss = self.criterion(y_pred, y)\n self.process_metrics(phase, y_pred, y, loss)\n\n return loss\n
test_step(batch, batch_idx, dataloader_idx=None)
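Test step for the model. Args: batch: The input batch as an (inputs, labels) pair. batch_idx: Index of the batch. dataloader_idx: Index of the test dataloader; 0 selects 'Test (Local)', any other value (including None) selects 'Test (Global)'. Returns: The loss for the batch.
def test_step(self, batch, batch_idx, dataloader_idx=None):\n \"\"\"\n Test step for the model.\n Args:\n batch: The input batch as an (inputs, labels) pair.\n batch_idx: Index of the batch.\n dataloader_idx: Index of the test dataloader; 0 selects 'Test (Local)', any other value (including None) selects 'Test (Global)'.\n\n Returns:\n The loss for the batch.\n \"\"\"\n if dataloader_idx == 0:\n return self.step(batch, batch_idx=batch_idx, phase=\"Test (Local)\")\n else:\n return self.step(batch, batch_idx=batch_idx, phase=\"Test (Global)\")\n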
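The routing in `test_step` has one edge case worth spelling out: only index 0 maps to the local test set, so a call that leaves `dataloader_idx` at its default of `None` is treated as a global test. A minimal sketch of that branch follows; the helper name `phase_for_dataloader` is hypothetical.

```python
def phase_for_dataloader(dataloader_idx=None):
    """Mirror the branch in test_step: index 0 is local, everything else is global."""
    return "Test (Local)" if dataloader_idx == 0 else "Test (Global)"


for idx in (0, 1, None):
    print(idx, "->", phase_for_dataloader(idx))
```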
training_step(batch, batch_idx)
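Training step for the model. Args: batch: The input batch as an (inputs, labels) pair. batch_idx: Index of the batch. Returns: The loss for the batch.
def training_step(self, batch, batch_idx):\n \"\"\"\n Training step for the model.\n Args:\n batch: The input batch as an (inputs, labels) pair.\n batch_idx: Index of the batch.\n\n Returns:\n The loss for the batch.\n \"\"\"\n return self.step(batch, batch_idx=batch_idx, phase=\"Train\")\n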
validation_step(batch, batch_idx)
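Validation step for the model. Args: batch: The input batch as an (inputs, labels) pair. batch_idx: Index of the batch. Returns: The loss for the batch.
def validation_step(self, batch, batch_idx):\n \"\"\"\n Validation step for the model.\n Args:\n batch: The input batch as an (inputs, labels) pair.\n batch_idx: Index of the batch.\n\n Returns:\n The loss for the batch.\n \"\"\"\n return self.step(batch, batch_idx=batch_idx, phase=\"Validation\")\n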
ContrastiveLoss
Bases: Module
Contrastive loss function.
nebula/core/models/cifar10/dualagg.py
class ContrastiveLoss(torch.nn.Module):\n \"\"\"\n Contrastive loss function.\n \"\"\"\n\n def __init__(self, mu=0.5):\n super().__init__()\n self.mu = mu\n self.cross_entropy_loss = torch.nn.CrossEntropyLoss()\n\n def forward(self, local_out, global_out, historical_out, labels):\n \"\"\"\n Calculates the contrastive loss between the local output, global output, and historical output.\n\n Args:\n local_out (torch.Tensor): The local output tensor of shape (batch_size, embedding_size).\n global_out (torch.Tensor): The global output tensor of shape (batch_size, embedding_size).\n historical_out (torch.Tensor): The historical output tensor of shape (batch_size, embedding_size).\n labels (torch.Tensor): The ground truth labels tensor of shape (batch_size,).\n\n Returns:\n torch.Tensor: The contrastive loss value.\n\n Raises:\n ValueError: If the input tensors have different shapes.\n\n Notes:\n - The contrastive loss is calculated as the difference between the mean cosine similarity of the local output\n with the historical output and the mean cosine similarity of the local output with the global output,\n multiplied by a scaling factor mu.\n - The cosine similarity values represent the similarity between the corresponding vectors in the input tensors.\n Higher values indicate greater similarity, while lower values indicate less similarity.\n \"\"\"\n if local_out.shape != global_out.shape or local_out.shape != historical_out.shape:\n raise ValueError(\"Input tensors must have the same shape.\")\n\n # Cross-entropy loss\n ce_loss = self.cross_entropy_loss(local_out, labels)\n # if round > 1:\n # Positive cosine similarity\n pos_cos_sim = F.cosine_similarity(local_out, historical_out, dim=1).mean()\n\n # Negative cosine similarity\n neg_cos_sim = -F.cosine_similarity(local_out, global_out, dim=1).mean()\n\n # Combined loss\n contrastive_loss = ce_loss + self.mu * 0.5 * (pos_cos_sim + neg_cos_sim)\n\n logging_training.debug(\n f\"Contrastive loss (mu={self.mu}) with 0.5 
of factor: ce_loss: {ce_loss}, pos_cos_sim_local_historical: {pos_cos_sim}, neg_cos_sim_local_global: {neg_cos_sim}, loss: {contrastive_loss}\"\n )\n return contrastive_loss\n
forward(local_out, global_out, historical_out, labels)
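Calculates the contrastive loss between the local output, global output, and historical output. The returned value is the cross-entropy of local_out against labels plus mu * 0.5 * (mean cosine similarity of local_out with historical_out minus mean cosine similarity of local_out with global_out).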
Parameters:
local_out (Tensor): The local output tensor of shape (batch_size, embedding_size).
global_out (Tensor): The global output tensor of shape (batch_size, embedding_size).
historical_out (Tensor): The historical output tensor of shape (batch_size, embedding_size).
labels (Tensor): The ground truth labels tensor of shape (batch_size,).
Returns:
torch.Tensor: The contrastive loss value.
Raises:
ValueError: If the input tensors have different shapes.
def forward(self, local_out, global_out, historical_out, labels):\n \"\"\"\n Calculates the contrastive loss between the local output, global output, and historical output.\n\n Args:\n local_out (torch.Tensor): The local output tensor of shape (batch_size, embedding_size).\n global_out (torch.Tensor): The global output tensor of shape (batch_size, embedding_size).\n historical_out (torch.Tensor): The historical output tensor of shape (batch_size, embedding_size).\n labels (torch.Tensor): The ground truth labels tensor of shape (batch_size,).\n\n Returns:\n torch.Tensor: The contrastive loss value.\n\n Raises:\n ValueError: If the input tensors have different shapes.\n\n Notes:\n - The contrastive loss is calculated as the difference between the mean cosine similarity of the local output\n with the historical output and the mean cosine similarity of the local output with the global output,\n multiplied by a scaling factor mu.\n - The cosine similarity values represent the similarity between the corresponding vectors in the input tensors.\n Higher values indicate greater similarity, while lower values indicate less similarity.\n \"\"\"\n if local_out.shape != global_out.shape or local_out.shape != historical_out.shape:\n raise ValueError(\"Input tensors must have the same shape.\")\n\n # Cross-entropy loss\n ce_loss = self.cross_entropy_loss(local_out, labels)\n # if round > 1:\n # Positive cosine similarity\n pos_cos_sim = F.cosine_similarity(local_out, historical_out, dim=1).mean()\n\n # Negative cosine similarity\n neg_cos_sim = -F.cosine_similarity(local_out, global_out, dim=1).mean()\n\n # Combined loss\n contrastive_loss = ce_loss + self.mu * 0.5 * (pos_cos_sim + neg_cos_sim)\n\n logging_training.debug(\n f\"Contrastive loss (mu={self.mu}) with 0.5 of factor: ce_loss: {ce_loss}, pos_cos_sim_local_historical: {pos_cos_sim}, neg_cos_sim_local_global: {neg_cos_sim}, loss: {contrastive_loss}\"\n )\n return contrastive_loss\n
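To make the combined loss concrete, the sketch below recomputes it in pure Python for tiny hand-written inputs. It follows the expression in the source above (cross-entropy plus `mu * 0.5 * (pos_cos_sim + neg_cos_sim)`), but it is only an illustration: the function names and input values are assumptions, and the small epsilon that `F.cosine_similarity` applies to the norms is ignored.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cross_entropy(logits, label):
    """Negative log-softmax of the true class, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(local_out, global_out, historical_out, labels, mu=0.5):
    n = len(labels)
    ce = sum(cross_entropy(l, y) for l, y in zip(local_out, labels)) / n
    pos = sum(cosine_similarity(l, h) for l, h in zip(local_out, historical_out)) / n
    neg = -sum(cosine_similarity(l, g) for l, g in zip(local_out, global_out)) / n
    return ce + mu * 0.5 * (pos + neg)


# Case A: historical and global outputs coincide, so the similarity terms cancel.
loss_a = combined_loss([[2.0, 0.0]], [[1.0, 1.0]], [[1.0, 1.0]], [0])

# Case B: local matches historical (cosine 1) and is orthogonal to global (cosine 0).
loss_b = combined_loss([[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 0.0]], [0])

print(loss_a, loss_b)
```

With this sign convention, similarity to the historical output raises the loss and similarity to the global output lowers it, so case B is penalized by `mu * 0.5 = 0.25` on top of its cross-entropy.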
DualAggModel
Bases: LightningModule
class DualAggModel(pl.LightningModule):\n def process_metrics(self, phase, y_pred, y, loss=None, mode=\"local\"):\n \"\"\"\n Calculate and log metrics for the given phase.\n Args:\n phase (str): One of 'Train', 'Validation', or 'Test'\n y_pred (torch.Tensor): Model predictions\n y (torch.Tensor): Ground truth labels\n loss (torch.Tensor, optional): Loss value\n \"\"\"\n\n y_pred_classes = torch.argmax(y_pred, dim=1)\n if phase == \"Train\":\n if mode == \"local\":\n output = self.local_train_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_train_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_train_metrics(y_pred_classes, y)\n elif phase == \"Validation\":\n if mode == \"local\":\n output = self.local_val_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_val_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_val_metrics(y_pred_classes, y)\n elif phase == \"Test\":\n if mode == \"local\":\n output = self.local_test_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_test_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_test_metrics(y_pred_classes, y)\n else:\n raise NotImplementedError\n # print(f\"y_pred shape: {y_pred.shape}, y_pred_classes shape: {y_pred_classes.shape}, y shape: {y.shape}\") # Debug print\n output = {\n f\"{mode}/{phase}/{key.replace('Multiclass', '').split('/')[-1]}\": value for key, value in output.items()\n }\n self.log_dict(output, prog_bar=True, logger=True)\n\n if self.local_cm is not None and self.historical_cm is not None and self.global_cm is not None:\n if mode == \"local\":\n self.local_cm.update(y_pred_classes, y)\n elif mode == \"historical\":\n self.historical_cm.update(y_pred_classes, y)\n elif mode == \"global\":\n self.global_cm.update(y_pred_classes, y)\n\n def log_metrics_by_epoch(self, phase, print_cm=False, plot_cm=False, mode=\"local\"):\n 
\"\"\"\n Log all metrics at the end of an epoch for the given phase.\n Args:\n phase (str): One of 'Train', 'Validation', or 'Test'\n :param phase:\n :param plot_cm:\n \"\"\"\n if mode == \"local\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.local_epoch_global_number[phase]}\")\n elif mode == \"historical\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.historical_epoch_global_number[phase]}\")\n elif mode == \"global\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.global_epoch_global_number[phase]}\")\n\n if phase == \"Train\":\n if mode == \"local\":\n output = self.local_train_metrics.compute()\n self.local_train_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_train_metrics.compute()\n self.historical_train_metrics.reset()\n elif mode == \"global\":\n output = self.global_train_metrics.compute()\n self.global_train_metrics.reset()\n elif phase == \"Validation\":\n if mode == \"local\":\n output = self.local_val_metrics.compute()\n self.local_val_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_val_metrics.compute()\n self.historical_val_metrics.reset()\n elif mode == \"global\":\n output = self.global_val_metrics.compute()\n self.global_val_metrics.reset()\n elif phase == \"Test\":\n if mode == \"local\":\n output = self.local_test_metrics.compute()\n self.local_test_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_test_metrics.compute()\n self.historical_test_metrics.reset()\n elif mode == \"global\":\n output = self.global_test_metrics.compute()\n self.global_test_metrics.reset()\n else:\n raise NotImplementedError\n\n output = {\n f\"{mode}/{phase}Epoch/{key.replace('Multiclass', '').split('/')[-1]}\": value\n for key, value in output.items()\n }\n\n self.log_dict(output, prog_bar=True, logger=True)\n\n if self.local_cm is not None and self.historical_cm is not None and self.global_cm is not None:\n if mode == \"local\":\n cm = 
self.local_cm.compute().cpu()\n elif mode == \"historical\":\n cm = self.historical_cm.compute().cpu()\n elif mode == \"global\":\n cm = self.global_cm.compute().cpu()\n print(f\"{mode}/{phase}Epoch/CM\\n\", cm) if print_cm else None\n if plot_cm:\n plt.figure(figsize=(10, 7))\n ax = sns.heatmap(cm.numpy(), annot=True, fmt=\"d\", cmap=\"Blues\")\n ax.set_xlabel(\"Predicted labels\")\n ax.set_ylabel(\"True labels\")\n ax.set_title(\"Confusion Matrix\")\n ax.set_xticks(range(self.num_classes))\n ax.set_yticks(range(self.num_classes))\n ax.xaxis.set_ticklabels([i for i in range(self.num_classes)])\n ax.yaxis.set_ticklabels([i for i in range(self.num_classes)])\n if mode == \"local\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.local_epoch_global_number[phase],\n )\n elif mode == \"historical\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.historical_epoch_global_number[phase],\n )\n elif mode == \"global\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.global_epoch_global_number[phase],\n )\n plt.close()\n\n if mode == \"local\":\n self.local_epoch_global_number[phase] += 1\n elif mode == \"historical\":\n self.historical_epoch_global_number[phase] += 1\n elif mode == \"global\":\n self.global_epoch_global_number[phase] += 1\n\n def __init__(\n self,\n input_channels=3,\n num_classes=10,\n learning_rate=1e-3,\n mu=0.5,\n metrics=None,\n confusion_matrix=None,\n seed=None,\n ):\n super().__init__()\n\n self.input_channels = input_channels\n self.num_classes = num_classes\n self.learning_rate = learning_rate\n self.mu = mu\n\n if metrics is None:\n metrics = MetricCollection([\n MulticlassAccuracy(num_classes=num_classes),\n MulticlassPrecision(num_classes=num_classes),\n MulticlassRecall(num_classes=num_classes),\n MulticlassF1Score(num_classes=num_classes),\n ])\n\n # Define metrics\n 
self.local_train_metrics = metrics.clone(prefix=\"Local/Train/\")\n self.local_val_metrics = metrics.clone(prefix=\"Local/Validation/\")\n self.local_test_metrics = metrics.clone(prefix=\"Local/Test/\")\n\n self.historical_train_metrics = metrics.clone(prefix=\"Historical/Train/\")\n self.historical_val_metrics = metrics.clone(prefix=\"Historical/Validation/\")\n self.historical_test_metrics = metrics.clone(prefix=\"Historical/Test/\")\n\n self.global_train_metrics = metrics.clone(prefix=\"Global/Train/\")\n self.global_val_metrics = metrics.clone(prefix=\"Global/Validation/\")\n self.global_test_metrics = metrics.clone(prefix=\"Global/Test/\")\n\n if confusion_matrix is None:\n self.local_cm = MulticlassConfusionMatrix(num_classes=num_classes)\n self.historical_cm = MulticlassConfusionMatrix(num_classes=num_classes)\n self.global_cm = MulticlassConfusionMatrix(num_classes=num_classes)\n\n # Set seed for reproducibility initialization\n if seed is not None:\n torch.manual_seed(seed)\n torch.cuda.manual_seed_all(seed)\n\n self.local_epoch_global_number = {\"Train\": 0, \"Validation\": 0, \"Test\": 0}\n self.historical_epoch_global_number = {\"Train\": 0, \"Validation\": 0, \"Test\": 0}\n self.global_epoch_global_number = {\"Train\": 0, \"Validation\": 0, \"Test\": 0}\n\n self.config = {\"beta1\": 0.851436, \"beta2\": 0.999689, \"amsgrad\": True}\n\n self.example_input_array = torch.rand(1, 3, 32, 32)\n self.learning_rate = learning_rate\n self.criterion = ContrastiveLoss(mu=self.mu)\n\n # Define layers of the model\n self.model = torch.nn.Sequential(\n torch.nn.Conv2d(input_channels, 16, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(16, 32, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(32, 64, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Flatten(),\n torch.nn.Linear(64 * 4 * 4, 512),\n torch.nn.ReLU(),\n torch.nn.Linear(512, num_classes),\n )\n\n # Siamese models of the 
model above\n self.historical_model = torch.nn.Sequential(\n torch.nn.Conv2d(input_channels, 16, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(16, 32, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(32, 64, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Flatten(),\n torch.nn.Linear(64 * 4 * 4, 512),\n torch.nn.ReLU(),\n torch.nn.Linear(512, num_classes),\n )\n self.global_model = torch.nn.Sequential(\n torch.nn.Conv2d(input_channels, 16, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(16, 32, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Conv2d(32, 64, 3, padding=1),\n torch.nn.ReLU(),\n torch.nn.MaxPool2d(2, 2),\n torch.nn.Flatten(),\n torch.nn.Linear(64 * 4 * 4, 512),\n torch.nn.ReLU(),\n torch.nn.Linear(512, num_classes),\n )\n # self.historical_model = copy.deepcopy(self.model)\n # self.global_model = copy.deepcopy(self.model)\n\n def forward(self, x, mode=\"local\"):\n \"\"\"Forward pass of the model.\"\"\"\n if mode == \"local\":\n return self.model(x)\n elif mode == \"global\":\n return self.global_model(x)\n elif mode == \"historical\":\n return self.historical_model(x)\n else:\n raise NotImplementedError\n\n def configure_optimizers(self):\n \"\"\" \"\"\"\n optimizer = torch.optim.Adam(\n self.parameters(),\n lr=self.learning_rate,\n betas=(self.config[\"beta1\"], self.config[\"beta2\"]),\n amsgrad=self.config[\"amsgrad\"],\n )\n return optimizer\n\n def step(self, batch, batch_idx, phase):\n images, labels = batch\n images = images.to(self.device)\n labels = labels.to(self.device)\n local_out = self.forward(images, mode=\"local\")\n with torch.no_grad():\n historical_out = self.forward(images, mode=\"historical\")\n global_out = self.forward(images, mode=\"global\")\n\n loss = self.criterion(local_out, global_out, historical_out, labels)\n\n # Get metrics for each batch and log them\n 
self.log(f\"{phase}/ConstrastiveLoss\", loss, prog_bar=True, logger=True) # Constrastive loss\n self.process_metrics(phase, local_out, labels, loss, mode=\"local\")\n self.process_metrics(phase, historical_out, labels, loss, mode=\"historical\")\n self.process_metrics(phase, global_out, labels, loss, mode=\"global\")\n\n return loss\n\n def training_step(self, batch, batch_id):\n \"\"\"\n Training step for the model.\n Args:\n batch:\n batch_id:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_id, \"Train\")\n\n def on_train_epoch_end(self):\n self.log_metrics_by_epoch(\"Train\", print_cm=True, plot_cm=True, mode=\"local\")\n self.log_metrics_by_epoch(\"Train\", print_cm=True, plot_cm=True, mode=\"historical\")\n self.log_metrics_by_epoch(\"Train\", print_cm=True, plot_cm=True, mode=\"global\")\n\n def validation_step(self, batch, batch_idx):\n \"\"\"\n Validation step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx, \"Validation\")\n\n def on_validation_epoch_end(self):\n self.log_metrics_by_epoch(\"Validation\", print_cm=True, plot_cm=False, mode=\"local\")\n self.log_metrics_by_epoch(\"Validation\", print_cm=True, plot_cm=False, mode=\"historical\")\n self.log_metrics_by_epoch(\"Validation\", print_cm=True, plot_cm=False, mode=\"global\")\n\n def test_step(self, batch, batch_idx):\n \"\"\"\n Test step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx, \"Test\")\n\n def on_test_epoch_end(self):\n self.log_metrics_by_epoch(\"Test\", print_cm=True, plot_cm=True, mode=\"local\")\n self.log_metrics_by_epoch(\"Test\", print_cm=True, plot_cm=True, mode=\"historical\")\n self.log_metrics_by_epoch(\"Test\", print_cm=True, plot_cm=True, mode=\"global\")\n\n def save_historical_model(self):\n \"\"\"\n Save the current local model as the historical model.\n \"\"\"\n logging_training.info(\"Copying local model to historical model.\")\n 
self.historical_model.load_state_dict(self.model.state_dict())\n\n def global_load_state_dict(self, state_dict):\n \"\"\"\n Load the given state dictionary into the global model.\n Args:\n state_dict (dict): The state dictionary to load into the global model.\n \"\"\"\n logging_training.info(\"Loading state dict into global model.\")\n adapted_state_dict = self.adapt_state_dict_for_model(state_dict, \"model\")\n self.global_model.load_state_dict(adapted_state_dict)\n\n def historical_load_state_dict(self, state_dict):\n \"\"\"\n Load the given state dictionary into the historical model.\n Args:\n state_dict (dict): The state dictionary to load into the historical model.\n \"\"\"\n logging_training.info(\"Loading state dict into historical model.\")\n adapted_state_dict = self.adapt_state_dict_for_model(state_dict, \"model\")\n self.historical_model.load_state_dict(adapted_state_dict)\n\n def adapt_state_dict_for_model(self, state_dict, model_prefix):\n \"\"\"\n Adapt the keys in the provided state_dict to match the structure expected by the model.\n \"\"\"\n new_state_dict = {}\n prefix = f\"{model_prefix}.\"\n for key, value in state_dict.items():\n if key.startswith(prefix):\n # Remove the specific prefix from each key\n new_key = key[len(prefix) :]\n new_state_dict[new_key] = value\n return new_state_dict\n\n def get_global_model_parameters(self):\n \"\"\"\n Get the parameters of the global model.\n \"\"\"\n return self.global_model.state_dict()\n\n def print_summary(self):\n \"\"\"\n Print a summary of local, historical and global models to check if they are the same.\n \"\"\"\n logging_training.info(\"Local model summary:\")\n logging_training.info(self.model)\n logging_training.info(\"Historical model summary:\")\n logging_training.info(self.historical_model)\n logging_training.info(\"Global model summary:\")\n logging_training.info(self.global_model)\n
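The `Linear(64 * 4 * 4, 512)` layer in the source above depends on the spatial size left after the three convolution and pooling stages. Assuming the 32x32 input implied by `example_input_array`, the arithmetic can be checked with the standard output-size formula; the helper names below are hypothetical.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial size after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial size after max pooling without padding."""
    return (size - kernel) // stride + 1


size = 32  # CIFAR-10 images are 32x32
for _ in range(3):
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding=1 keeps the size
    size = pool_out(size, kernel=2, stride=2)   # 2x2 max pool halves it

flat_features = 64 * size * size  # 64 channels after the last convolution
print(size, flat_features)
```

A different input resolution would change `flat_features`, and the first linear layer would have to be resized accordingly.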
adapt_state_dict_for_model(state_dict, model_prefix)
Adapt the keys in the provided state_dict to match the structure expected by the model.
def adapt_state_dict_for_model(self, state_dict, model_prefix):\n \"\"\"\n Adapt the keys in the provided state_dict to match the structure expected by the model.\n \"\"\"\n new_state_dict = {}\n prefix = f\"{model_prefix}.\"\n for key, value in state_dict.items():\n if key.startswith(prefix):\n # Remove the specific prefix from each key\n new_key = key[len(prefix) :]\n new_state_dict[new_key] = value\n return new_state_dict\n
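The effect of the prefix stripping is easiest to see on a small dictionary. This sketch reproduces the same filtering logic with strings in place of tensors; the function name `strip_prefix` and the sample keys are assumptions for illustration.

```python
def strip_prefix(state_dict, model_prefix):
    """Keep only keys under '<model_prefix>.' and drop that prefix."""
    prefix = f"{model_prefix}."
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical full state dict with the three sub-models side by side.
full_state = {
    "model.0.weight": "w0",
    "model.0.bias": "b0",
    "historical_model.0.weight": "hw0",
    "global_model.0.weight": "gw0",
}
adapted = strip_prefix(full_state, "model")
print(adapted)
```

Because the match is on `"model."` at the start of the key, entries under `historical_model.` and `global_model.` are dropped rather than partially matched, and the remaining keys line up with what a bare `Sequential` expects.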
def configure_optimizers(self):\n \"\"\"Configure an Adam optimizer using the betas and amsgrad values in self.config.\"\"\"\n optimizer = torch.optim.Adam(\n self.parameters(),\n lr=self.learning_rate,\n betas=(self.config[\"beta1\"], self.config[\"beta2\"]),\n amsgrad=self.config[\"amsgrad\"],\n )\n return optimizer\n
forward(x, mode='local')
def forward(self, x, mode=\"local\"):\n \"\"\"Forward pass of the model.\"\"\"\n if mode == \"local\":\n return self.model(x)\n elif mode == \"global\":\n return self.global_model(x)\n elif mode == \"historical\":\n return self.historical_model(x)\n else:\n raise NotImplementedError\n
get_global_model_parameters()
Get the parameters of the global model.
def get_global_model_parameters(self):\n \"\"\"\n Get the parameters of the global model.\n \"\"\"\n return self.global_model.state_dict()\n
global_load_state_dict(state_dict)
Load the given state dictionary into the global model. Args: state_dict (dict): The state dictionary to load into the global model.
def global_load_state_dict(self, state_dict):\n \"\"\"\n Load the given state dictionary into the global model.\n Args:\n state_dict (dict): The state dictionary to load into the global model.\n \"\"\"\n logging_training.info(\"Loading state dict into global model.\")\n adapted_state_dict = self.adapt_state_dict_for_model(state_dict, \"model\")\n self.global_model.load_state_dict(adapted_state_dict)\n
historical_load_state_dict(state_dict)
Load the given state dictionary into the historical model. Args: state_dict (dict): The state dictionary to load into the historical model.
def historical_load_state_dict(self, state_dict):\n \"\"\"\n Load the given state dictionary into the historical model.\n Args:\n state_dict (dict): The state dictionary to load into the historical model.\n \"\"\"\n logging_training.info(\"Loading state dict into historical model.\")\n adapted_state_dict = self.adapt_state_dict_for_model(state_dict, \"model\")\n self.historical_model.load_state_dict(adapted_state_dict)\n
log_metrics_by_epoch(phase, print_cm=False, plot_cm=False, mode='local')
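Log all metrics at the end of an epoch for the given phase. Args: phase (str): One of 'Train', 'Validation', or 'Test'. print_cm (bool): Whether to print the confusion matrix. plot_cm (bool): Whether to plot the confusion matrix and add the figure to the logger. mode (str): One of 'local', 'historical', or 'global'.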
def log_metrics_by_epoch(self, phase, print_cm=False, plot_cm=False, mode=\"local\"):\n \"\"\"\n Log all metrics at the end of an epoch for the given phase.\n Args:\n phase (str): One of 'Train', 'Validation', or 'Test'\n :param phase:\n :param plot_cm:\n \"\"\"\n if mode == \"local\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.local_epoch_global_number[phase]}\")\n elif mode == \"historical\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.historical_epoch_global_number[phase]}\")\n elif mode == \"global\":\n print(f\"Epoch end: {mode} {phase}, epoch number: {self.global_epoch_global_number[phase]}\")\n\n if phase == \"Train\":\n if mode == \"local\":\n output = self.local_train_metrics.compute()\n self.local_train_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_train_metrics.compute()\n self.historical_train_metrics.reset()\n elif mode == \"global\":\n output = self.global_train_metrics.compute()\n self.global_train_metrics.reset()\n elif phase == \"Validation\":\n if mode == \"local\":\n output = self.local_val_metrics.compute()\n self.local_val_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_val_metrics.compute()\n self.historical_val_metrics.reset()\n elif mode == \"global\":\n output = self.global_val_metrics.compute()\n self.global_val_metrics.reset()\n elif phase == \"Test\":\n if mode == \"local\":\n output = self.local_test_metrics.compute()\n self.local_test_metrics.reset()\n elif mode == \"historical\":\n output = self.historical_test_metrics.compute()\n self.historical_test_metrics.reset()\n elif mode == \"global\":\n output = self.global_test_metrics.compute()\n self.global_test_metrics.reset()\n else:\n raise NotImplementedError\n\n output = {\n f\"{mode}/{phase}Epoch/{key.replace('Multiclass', '').split('/')[-1]}\": value\n for key, value in output.items()\n }\n\n self.log_dict(output, prog_bar=True, logger=True)\n\n if self.local_cm is not None and self.historical_cm is 
not None and self.global_cm is not None:\n if mode == \"local\":\n cm = self.local_cm.compute().cpu()\n elif mode == \"historical\":\n cm = self.historical_cm.compute().cpu()\n elif mode == \"global\":\n cm = self.global_cm.compute().cpu()\n print(f\"{mode}/{phase}Epoch/CM\\n\", cm) if print_cm else None\n if plot_cm:\n plt.figure(figsize=(10, 7))\n ax = sns.heatmap(cm.numpy(), annot=True, fmt=\"d\", cmap=\"Blues\")\n ax.set_xlabel(\"Predicted labels\")\n ax.set_ylabel(\"True labels\")\n ax.set_title(\"Confusion Matrix\")\n ax.set_xticks(range(self.num_classes))\n ax.set_yticks(range(self.num_classes))\n ax.xaxis.set_ticklabels([i for i in range(self.num_classes)])\n ax.yaxis.set_ticklabels([i for i in range(self.num_classes)])\n if mode == \"local\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.local_epoch_global_number[phase],\n )\n elif mode == \"historical\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.historical_epoch_global_number[phase],\n )\n elif mode == \"global\":\n self.logger.experiment.add_figure(\n f\"{mode}/{phase}Epoch/CM\",\n ax.get_figure(),\n global_step=self.global_epoch_global_number[phase],\n )\n plt.close()\n\n if mode == \"local\":\n self.local_epoch_global_number[phase] += 1\n elif mode == \"historical\":\n self.historical_epoch_global_number[phase] += 1\n elif mode == \"global\":\n self.global_epoch_global_number[phase] += 1\n
print_summary()
Print a summary of local, historical and global models to check if they are the same.
def print_summary(self):\n \"\"\"\n Print a summary of local, historical and global models to check if they are the same.\n \"\"\"\n logging_training.info(\"Local model summary:\")\n logging_training.info(self.model)\n logging_training.info(\"Historical model summary:\")\n logging_training.info(self.historical_model)\n logging_training.info(\"Global model summary:\")\n logging_training.info(self.global_model)\n
process_metrics(phase, y_pred, y, loss=None, mode='local')
Calculate and log metrics for the given phase. Args: phase (str): One of 'Train', 'Validation', or 'Test' y_pred (torch.Tensor): Model predictions y (torch.Tensor): Ground truth labels loss (torch.Tensor, optional): Loss value
def process_metrics(self, phase, y_pred, y, loss=None, mode=\"local\"):\n \"\"\"\n Calculate and log metrics for the given phase.\n Args:\n phase (str): One of 'Train', 'Validation', or 'Test'\n y_pred (torch.Tensor): Model predictions\n y (torch.Tensor): Ground truth labels\n loss (torch.Tensor, optional): Loss value\n \"\"\"\n\n y_pred_classes = torch.argmax(y_pred, dim=1)\n if phase == \"Train\":\n if mode == \"local\":\n output = self.local_train_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_train_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_train_metrics(y_pred_classes, y)\n elif phase == \"Validation\":\n if mode == \"local\":\n output = self.local_val_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_val_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_val_metrics(y_pred_classes, y)\n elif phase == \"Test\":\n if mode == \"local\":\n output = self.local_test_metrics(y_pred_classes, y)\n elif mode == \"historical\":\n output = self.historical_test_metrics(y_pred_classes, y)\n elif mode == \"global\":\n output = self.global_test_metrics(y_pred_classes, y)\n else:\n raise NotImplementedError\n # print(f\"y_pred shape: {y_pred.shape}, y_pred_classes shape: {y_pred_classes.shape}, y shape: {y.shape}\") # Debug print\n output = {\n f\"{mode}/{phase}/{key.replace('Multiclass', '').split('/')[-1]}\": value for key, value in output.items()\n }\n self.log_dict(output, prog_bar=True, logger=True)\n\n if self.local_cm is not None and self.historical_cm is not None and self.global_cm is not None:\n if mode == \"local\":\n self.local_cm.update(y_pred_classes, y)\n elif mode == \"historical\":\n self.historical_cm.update(y_pred_classes, y)\n elif mode == \"global\":\n self.global_cm.update(y_pred_classes, y)\n
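The key rewriting used before `log_dict` can be illustrated on its own. This is a small sketch under stated assumptions: `rename_metric_keys` is a hypothetical helper, and the raw keys in `raw` are illustrative guesses at what a metric collection might produce, not values taken from NEBULA.

```python
def rename_metric_keys(output, mode, phase, epoch=False):
    """Rewrite raw metric keys to `<mode>/<phase>[Epoch]/<name>`."""
    suffix = "Epoch" if epoch else ""
    return {
        f"{mode}/{phase}{suffix}/{key.replace('Multiclass', '').split('/')[-1]}": value
        for key, value in output.items()
    }


# Hypothetical raw metric output.
raw = {"Train/MulticlassAccuracy": 0.91, "Train/MulticlassF1Score": 0.88}

per_step = rename_metric_keys(raw, mode="local", phase="Train")
per_epoch = rename_metric_keys(raw, mode="global", phase="Train", epoch=True)
print(per_step)
print(per_epoch)
```

Only the last path component of each raw key survives, so any prefix attached by the metric collection is replaced by the mode and phase.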
save_historical_model()
Save the current local model as the historical model.
def save_historical_model(self):\n \"\"\"\n Save the current local model as the historical model.\n \"\"\"\n logging_training.info(\"Copying local model to historical model.\")\n self.historical_model.load_state_dict(self.model.state_dict())\n
test_step(batch, batch_idx)
def test_step(self, batch, batch_idx):\n \"\"\"\n Test step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx, \"Test\")\n
training_step(batch, batch_id)
def training_step(self, batch, batch_id):\n \"\"\"\n Training step for the model.\n Args:\n batch:\n batch_id:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_id, \"Train\")\n
def validation_step(self, batch, batch_idx):\n \"\"\"\n Validation step for the model.\n Args:\n batch:\n batch_idx:\n\n Returns:\n \"\"\"\n return self.step(batch, batch_idx, \"Validation\")\n
Generated protocol buffer code.
Lightning
nebula/core/training/lightning.py
class Lightning:\n DEFAULT_MODEL_WEIGHT = 1\n BYPASS_MODEL_WEIGHT = 0\n\n def __init__(self, model, data, config=None):\n # self.model = torch.compile(model, mode=\"reduce-overhead\")\n self.model = model\n self.data = data\n self.config = config\n self._trainer = None\n self.epochs = 1\n self.round = 0\n self.experiment_name = self.config.participant[\"scenario_args\"][\"name\"]\n self.idx = self.config.participant[\"device_args\"][\"idx\"]\n self.log_dir = os.path.join(self.config.participant[\"tracking_args\"][\"log_dir\"], self.experiment_name)\n self._logger = None\n self.create_logger()\n enable_deterministic(self.config)\n\n @property\n def logger(self):\n return self._logger\n\n def get_round(self):\n return self.round\n\n def set_model(self, model):\n self.model = model\n\n def set_data(self, data):\n self.data = data\n\n def create_logger(self):\n if self.config.participant[\"tracking_args\"][\"local_tracking\"] == \"csv\":\n nebulalogger = CSVLogger(f\"{self.log_dir}\", name=\"metrics\", version=f\"participant_{self.idx}\")\n elif self.config.participant[\"tracking_args\"][\"local_tracking\"] == \"basic\":\n logger_config = None\n if self._logger is not None:\n logger_config = self._logger.get_logger_config()\n nebulalogger = NebulaTensorBoardLogger(\n self.config.participant[\"scenario_args\"][\"start_time\"],\n f\"{self.log_dir}\",\n name=\"metrics\",\n version=f\"participant_{self.idx}\",\n log_graph=False,\n )\n # Restore logger configuration\n nebulalogger.set_logger_config(logger_config)\n elif self.config.participant[\"tracking_args\"][\"local_tracking\"] == \"advanced\":\n nebulalogger = NebulaLogger(\n config=self.config,\n engine=self,\n scenario_start_time=self.config.participant[\"scenario_args\"][\"start_time\"],\n repo=f\"{self.config.participant['tracking_args']['log_dir']}\",\n experiment=self.experiment_name,\n run_name=f\"participant_{self.idx}\",\n train_metric_prefix=\"train_\",\n test_metric_prefix=\"test_\",\n 
val_metric_prefix=\"val_\",\n log_system_params=False,\n )\n # nebulalogger_aim = NebulaLogger(config=self.config, engine=self, scenario_start_time=self.config.participant[\"scenario_args\"][\"start_time\"], repo=f\"aim://nebula-frontend:8085\",\n # experiment=self.experiment_name, run_name=f\"participant_{self.idx}\",\n # train_metric_prefix='train_', test_metric_prefix='test_', val_metric_prefix='val_', log_system_params=False)\n self.config.participant[\"tracking_args\"][\"run_hash\"] = nebulalogger.experiment.hash\n else:\n nebulalogger = None\n\n self._logger = nebulalogger\n\n def create_trainer(self):\n # Create a new trainer and logger for each round\n self.create_logger()\n num_gpus = torch.cuda.device_count()\n if self.config.participant[\"device_args\"][\"accelerator\"] == \"gpu\" and num_gpus > 0:\n gpu_index = self.config.participant[\"device_args\"][\"idx\"] % num_gpus\n logging_training.info(f\"Creating trainer with accelerator GPU ({gpu_index})\")\n self._trainer = Trainer(\n callbacks=[ModelSummary(max_depth=1), NebulaProgressBar()],\n max_epochs=self.epochs,\n accelerator=self.config.participant[\"device_args\"][\"accelerator\"],\n devices=[gpu_index],\n logger=self._logger,\n enable_checkpointing=False,\n enable_model_summary=False,\n # deterministic=True\n )\n else:\n logging_training.info(\"Creating trainer with accelerator CPU\")\n self._trainer = Trainer(\n callbacks=[ModelSummary(max_depth=1), NebulaProgressBar()],\n max_epochs=self.epochs,\n accelerator=self.config.participant[\"device_args\"][\"accelerator\"],\n devices=\"auto\",\n logger=self._logger,\n enable_checkpointing=False,\n enable_model_summary=False,\n # deterministic=True\n )\n logging_training.info(f\"Trainer strategy: {self._trainer.strategy}\")\n\n def validate_neighbour_model(self, neighbour_model_param):\n avg_loss = 0\n running_loss = 0\n bootstrap_dataloader = self.data.bootstrap_dataloader()\n num_samples = 0\n neighbour_model = copy.deepcopy(self.model)\n 
neighbour_model.load_state_dict(neighbour_model_param)\n\n # enable evaluation mode, prevent memory leaks.\n # no need to switch back to training since model is not further used.\n if torch.cuda.is_available():\n neighbour_model = neighbour_model.to(\"cuda\")\n neighbour_model.eval()\n\n # bootstrap_dataloader = bootstrap_dataloader.to('cuda')\n with torch.no_grad():\n for inputs, labels in bootstrap_dataloader:\n if torch.cuda.is_available():\n inputs = inputs.to(\"cuda\")\n labels = labels.to(\"cuda\")\n outputs = neighbour_model(inputs)\n loss = F.cross_entropy(outputs, labels)\n running_loss += loss.item()\n num_samples += inputs.size(0)\n\n avg_loss = running_loss / len(bootstrap_dataloader)\n logging_training.info(f\"Computed neighbor loss over {num_samples} data samples\")\n return avg_loss\n\n def get_hash_model(self):\n \"\"\"\n Returns:\n str: SHA256 hash of model parameters\n \"\"\"\n return hashlib.sha256(self.serialize_model(self.model)).hexdigest()\n\n def set_epochs(self, epochs):\n self.epochs = epochs\n\n def serialize_model(self, model):\n # From https://pytorch.org/docs/stable/notes/serialization.html\n try:\n buffer = io.BytesIO()\n with gzip.GzipFile(fileobj=buffer, mode=\"wb\") as f:\n torch.save(model, f, pickle_protocol=pickle.HIGHEST_PROTOCOL)\n serialized_data = buffer.getvalue()\n buffer.close()\n del buffer\n return serialized_data\n except Exception as e:\n raise ParameterSerializeError(\"Error serializing model\") from e\n\n def deserialize_model(self, data):\n # From https://pytorch.org/docs/stable/notes/serialization.html\n try:\n buffer = io.BytesIO(data)\n with gzip.GzipFile(fileobj=buffer, mode=\"rb\") as f:\n params_dict = torch.load(f)\n buffer.close()\n del buffer\n return OrderedDict(params_dict)\n except Exception as e:\n raise ParameterDeserializeError(\"Error decoding parameters\") from e\n\n def set_model_parameters(self, params, initialize=False):\n try:\n self.model.load_state_dict(params)\n except Exception as e:\n 
raise ParameterSettingError(\"Error setting parameters\") from e\n\n def get_model_parameters(self, bytes=False, initialize=False):\n if bytes:\n return self.serialize_model(self.model.state_dict())\n return self.model.state_dict()\n\n async def train(self):\n try:\n self.create_trainer()\n logging.info(f\"{'=' * 10} [Training] Started (check training logs for progress) {'=' * 10}\")\n await asyncio.to_thread(self._train_sync)\n logging.info(f\"{'=' * 10} [Training] Finished (check training logs for progress) {'=' * 10}\")\n except Exception as e:\n logging_training.error(f\"Error training model: {e}\")\n logging_training.error(traceback.format_exc())\n\n def _train_sync(self):\n try:\n self._trainer.fit(self.model, self.data)\n except Exception as e:\n logging_training.error(f\"Error in _train_sync: {e}\")\n tb = traceback.format_exc()\n logging_training.error(f\"Traceback: {tb}\")\n # If \"raise\", the exception will be managed by the main thread\n\n async def test(self):\n try:\n self.create_trainer()\n logging.info(f\"{'=' * 10} [Testing] Started (check training logs for progress) {'=' * 10}\")\n await asyncio.to_thread(self._test_sync)\n logging.info(f\"{'=' * 10} [Testing] Finished (check training logs for progress) {'=' * 10}\")\n except Exception as e:\n logging_training.error(f\"Error testing model: {e}\")\n logging_training.error(traceback.format_exc())\n\n def _test_sync(self):\n try:\n self._trainer.test(self.model, self.data, verbose=True)\n except Exception as e:\n logging_training.error(f\"Error in _test_sync: {e}\")\n tb = traceback.format_exc()\n logging_training.error(f\"Traceback: {tb}\")\n # If \"raise\", the exception will be managed by the main thread\n\n def cleanup(self):\n if self._trainer is not None:\n self._trainer._teardown()\n del self._trainer\n if self.data is not None:\n self.data.teardown()\n gc.collect()\n torch.cuda.empty_cache()\n\n def get_model_weight(self):\n weight = self.data.model_weight\n if weight is None:\n raise 
ValueError(\"Model weight not set. Please call setup('fit') before requesting model weight.\")\n return weight\n\n def on_round_start(self):\n self.data.setup()\n self._logger.log_data({\"Round\": self.round})\n # self.reporter.enqueue_data(\"Round\", self.round)\n\n def on_round_end(self):\n self._logger.global_step = self._logger.global_step + self._logger.local_step\n self._logger.local_step = 0\n self.round += 1\n self.model.on_round_end()\n logging.info(\"Flushing memory cache at the end of round...\")\n self.cleanup()\n\n def on_learning_cycle_end(self):\n self._logger.log_data({\"Round\": self.round})\n
get_hash_model()
SHA256 hash of model parameters
def get_hash_model(self):\n \"\"\"\n Returns:\n str: SHA256 hash of model parameters\n \"\"\"\n return hashlib.sha256(self.serialize_model(self.model)).hexdigest()\n
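The serialize-then-hash idea can be sketched with the standard library alone. This is an assumption-laden illustration, not the NEBULA code: `pickle.dump` stands in for `torch.save`, `serialize` and `hash_params` are hypothetical names, and `mtime=0` is an addition not present in the source. Without a fixed `mtime`, `gzip.GzipFile` writes the current time into the header, which may make the hash of identical parameters differ between calls.

```python
import gzip
import hashlib
import io
import pickle


def serialize(params):
    """Pickle and gzip `params` into bytes (stand-in for torch.save)."""
    buffer = io.BytesIO()
    # mtime=0 keeps the gzip header free of the current timestamp.
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as f:
        pickle.dump(params, f, protocol=pickle.HIGHEST_PROTOCOL)
    return buffer.getvalue()


def hash_params(params):
    """Return the SHA256 hex digest of the serialized parameters."""
    return hashlib.sha256(serialize(params)).hexdigest()


params = {"fc.weight": [0.1, 0.2], "fc.bias": [0.0]}
digest = hash_params(params)
print(digest)
```

Hashing the same object twice gives the same digest, and changing any value changes it, which is what makes the digest usable as a cheap equality check between participants.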
NebulaProgressBar
Bases: ProgressBar
Nebula progress bar for training. Logs the percentage of completion of the training process using logging.
class NebulaProgressBar(ProgressBar):\n \"\"\"Nebula progress bar for training.\n Logs the percentage of completion of the training process using logging.\n \"\"\"\n\n def __init__(self, log_every_n_steps=100):\n super().__init__()\n self.enable = True\n self.log_every_n_steps = log_every_n_steps\n\n def enable(self):\n \"\"\"Enable progress bar logging.\"\"\"\n self.enable = True\n\n def disable(self):\n \"\"\"Disable the progress bar logging.\"\"\"\n self.enable = False\n\n def on_train_epoch_start(self, trainer, pl_module):\n \"\"\"Called when the training epoch starts.\"\"\"\n super().on_train_epoch_start(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Starting Epoch {trainer.current_epoch}\")\n\n def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):\n \"\"\"Called at the end of each training batch.\"\"\"\n super().on_train_batch_end(trainer, pl_module, outputs, batch, batch_idx)\n if self.enable:\n if (batch_idx + 1) % self.log_every_n_steps == 0 or (batch_idx + 1) == self.total_train_batches:\n # Calculate percentage complete for the current epoch\n percent = ((batch_idx + 1) / self.total_train_batches) * 100 # +1 to count current batch\n logging_training.info(f\"Epoch {trainer.current_epoch} - {percent:.01f}% complete\")\n\n def on_train_epoch_end(self, trainer, pl_module):\n \"\"\"Called at the end of the training epoch.\"\"\"\n super().on_train_epoch_end(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Epoch {trainer.current_epoch} finished\")\n\n def on_validation_epoch_start(self, trainer, pl_module):\n super().on_validation_epoch_start(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Starting validation for Epoch {trainer.current_epoch}\")\n\n def on_validation_epoch_end(self, trainer, pl_module):\n super().on_validation_epoch_end(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Validation for Epoch {trainer.current_epoch} finished\")\n\n def 
on_test_batch_start(self, trainer, pl_module, batch, batch_idx, dataloader_idx):\n super().on_test_batch_start(trainer, pl_module, batch, batch_idx, dataloader_idx)\n if not self.has_dataloader_changed(dataloader_idx):\n return\n\n def on_test_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):\n \"\"\"Called at the end of each test batch.\"\"\"\n super().on_test_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx)\n if self.enable:\n total_batches = self.total_test_batches_current_dataloader\n if total_batches == 0:\n logging_training.warning(\n f\"Total test batches is 0 for dataloader {dataloader_idx}, cannot compute progress.\"\n )\n return\n\n if (batch_idx + 1) % self.log_every_n_steps == 0 or (batch_idx + 1) == total_batches:\n percent = ((batch_idx + 1) / total_batches) * 100 # +1 to count the current batch\n logging_training.info(\n f\"Test Epoch {trainer.current_epoch}, Dataloader {dataloader_idx} - {percent:.01f}% complete\"\n )\n\n def on_test_epoch_start(self, trainer, pl_module):\n super().on_test_epoch_start(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Starting testing for Epoch {trainer.current_epoch}\")\n\n def on_test_epoch_end(self, trainer, pl_module):\n super().on_test_epoch_end(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Testing for Epoch {trainer.current_epoch} finished\")\n
disable()
Disable the progress bar logging.
def disable(self):\n \"\"\"Disable the progress bar logging.\"\"\"\n self.enable = False\n
enable()
Enable progress bar logging.
def enable(self):\n \"\"\"Enable progress bar logging.\"\"\"\n self.enable = True\n
on_test_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx)
Called at the end of each test batch.
def on_test_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):\n \"\"\"Called at the end of each test batch.\"\"\"\n super().on_test_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx)\n if self.enable:\n total_batches = self.total_test_batches_current_dataloader\n if total_batches == 0:\n logging_training.warning(\n f\"Total test batches is 0 for dataloader {dataloader_idx}, cannot compute progress.\"\n )\n return\n\n if (batch_idx + 1) % self.log_every_n_steps == 0 or (batch_idx + 1) == total_batches:\n percent = ((batch_idx + 1) / total_batches) * 100 # +1 to count the current batch\n logging_training.info(\n f\"Test Epoch {trainer.current_epoch}, Dataloader {dataloader_idx} - {percent:.01f}% complete\"\n )\n
on_train_batch_end(trainer, pl_module, outputs, batch, batch_idx)
Called at the end of each training batch.
def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):\n \"\"\"Called at the end of each training batch.\"\"\"\n super().on_train_batch_end(trainer, pl_module, outputs, batch, batch_idx)\n if self.enable:\n if (batch_idx + 1) % self.log_every_n_steps == 0 or (batch_idx + 1) == self.total_train_batches:\n # Calculate percentage complete for the current epoch\n percent = ((batch_idx + 1) / self.total_train_batches) * 100 # +1 to count current batch\n logging_training.info(f\"Epoch {trainer.current_epoch} - {percent:.01f}% complete\")\n
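To see which batches trigger a log line, the condition can be pulled out into a standalone function. This is a sketch only: `progress_points` is a hypothetical helper that mirrors the check in `on_train_batch_end` without any Lightning dependency.

```python
def progress_points(total_batches, log_every_n_steps=100):
    """Return (batch_idx, percent) pairs for every batch that would be logged."""
    points = []
    for batch_idx in range(total_batches):
        done = batch_idx + 1  # +1 to count the current batch
        if done % log_every_n_steps == 0 or done == total_batches:
            points.append((batch_idx, f"{done / total_batches * 100:.1f}%"))
    return points


points = progress_points(250)
print(points)
```

The second condition guarantees a final 100% line even when the number of batches is not a multiple of `log_every_n_steps`.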
on_train_epoch_end(trainer, pl_module)
Called at the end of the training epoch.
def on_train_epoch_end(self, trainer, pl_module):\n \"\"\"Called at the end of the training epoch.\"\"\"\n super().on_train_epoch_end(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Epoch {trainer.current_epoch} finished\")\n
on_train_epoch_start(trainer, pl_module)
Called when the training epoch starts.
def on_train_epoch_start(self, trainer, pl_module):\n \"\"\"Called when the training epoch starts.\"\"\"\n super().on_train_epoch_start(trainer, pl_module)\n if self.enable:\n logging_training.info(f\"Starting Epoch {trainer.current_epoch}\")\n
ParameterDeserializeError
Bases: Exception
Custom exception for errors deserializing model parameters.
class ParameterDeserializeError(Exception):\n \"\"\"Custom exception for errors deserializing model parameters.\"\"\"\n
ParameterSerializeError
class ParameterSerializeError(Exception):\n \"\"\"Custom exception for errors serializing model parameters.\"\"\"\n
ParameterSettingError
class ParameterSettingError(Exception):\n \"\"\"Custom exception for errors setting model parameters.\"\"\"\n
Siamese
nebula/core/training/siamese.py
class Siamese:\n def __init__(self, model, data, config=None, logger=None):\n # self.model = torch.compile(model, mode=\"reduce-overhead\")\n self.model = model\n self.data = data\n self.config = config\n self.logger = logger\n self.__trainer = None\n self.epochs = 1\n logging.getLogger(\"lightning.pytorch\").setLevel(logging.INFO)\n self.round = 0\n enable_deterministic(self.config)\n self.logger.log_data({\"Round\": self.round}, step=self.logger.global_step)\n\n @property\n def logger(self):\n return self._logger\n\n def get_round(self):\n return self.round\n\n def set_model(self, model):\n self.model = model\n\n def set_data(self, data):\n self.data = data\n\n def create_trainer(self):\n logging.info(\n \"[Trainer] Creating trainer with accelerator: {}\".format(\n self.config.participant[\"device_args\"][\"accelerator\"]\n )\n )\n progress_bar = RichProgressBar(\n theme=RichProgressBarTheme(\n description=\"green_yellow\",\n progress_bar=\"green1\",\n progress_bar_finished=\"green1\",\n progress_bar_pulse=\"#6206E0\",\n batch_progress=\"green_yellow\",\n time=\"grey82\",\n processing_speed=\"grey82\",\n metrics=\"grey82\",\n ),\n leave=True,\n )\n if self.config.participant[\"device_args\"][\"accelerator\"] == \"gpu\":\n # NEBULA uses 2 GPUs (max) to distribute the nodes.\n if self.config.participant[\"device_args\"][\"devices\"] > 1:\n # If you have more than 2 GPUs, you should specify which ones to use.\n gpu_id = ([1] if self.config.participant[\"device_args\"][\"idx\"] % 2 == 0 else [2],)\n else:\n # If there is only one GPU, it will be used.\n gpu_id = [1]\n\n self.__trainer = Trainer(\n callbacks=[RichModelSummary(max_depth=1), progress_bar],\n max_epochs=self.epochs,\n accelerator=self.config.participant[\"device_args\"][\"accelerator\"],\n devices=gpu_id,\n logger=self.logger,\n log_every_n_steps=50,\n enable_checkpointing=False,\n enable_model_summary=False,\n enable_progress_bar=True,\n # deterministic=True\n )\n else:\n # NEBULA uses only CPU to 
distribute the nodes\n self.__trainer = Trainer(\n callbacks=[RichModelSummary(max_depth=1), progress_bar],\n max_epochs=self.epochs,\n accelerator=self.config.participant[\"device_args\"][\"accelerator\"],\n devices=\"auto\",\n logger=self.logger,\n log_every_n_steps=50,\n enable_checkpointing=False,\n enable_model_summary=False,\n enable_progress_bar=True,\n # deterministic=True\n )\n\n def get_global_model_parameters(self):\n return self.model.get_global_model_parameters()\n\n def set_parameter_second_aggregation(self, params):\n try:\n logging.info(\"Setting parameters in second aggregation...\")\n self.model.load_state_dict(params)\n except:\n raise Exception(\"Error setting parameters\")\n\n def get_model_parameters(self, bytes=False):\n if bytes:\n return self.serialize_model(self.model.state_dict())\n else:\n return self.model.state_dict()\n\n def get_hash_model(self):\n \"\"\"\n Returns:\n str: SHA256 hash of model parameters\n \"\"\"\n return hashlib.sha256(self.serialize_model()).hexdigest()\n\n def set_epochs(self, epochs):\n self.epochs = epochs\n\n ####\n # Model parameters serialization/deserialization\n # From https://pytorch.org/docs/stable/notes/serialization.html\n ####\n def serialize_model(self, model):\n try:\n buffer = io.BytesIO()\n # with gzip.GzipFile(fileobj=buffer, mode='wb') as f:\n # torch.save(params, f)\n torch.save(model, buffer)\n return buffer.getvalue()\n except:\n raise Exception(\"Error serializing model\")\n\n def deserialize_model(self, data):\n try:\n buffer = io.BytesIO(data)\n # with gzip.GzipFile(fileobj=buffer, mode='rb') as f:\n # params_dict = torch.load(f, map_location='cpu')\n params_dict = torch.load(buffer, map_location=\"cpu\")\n return OrderedDict(params_dict)\n except:\n raise Exception(\"Error decoding parameters\")\n\n def set_model_parameters(self, params, initialize=False):\n try:\n if initialize:\n self.model.load_state_dict(params)\n self.model.global_load_state_dict(params)\n 
self.model.historical_load_state_dict(params)\n else:\n # First aggregation\n self.model.global_load_state_dict(params)\n except:\n raise Exception(\"Error setting parameters\")\n\n def train(self):\n try:\n self.create_trainer()\n # torch.autograd.set_detect_anomaly(True)\n # TODO: It is necessary to train only the local model, save the history of the previous model and then load it, the global model is the aggregation of all the models.\n self.__trainer.fit(self.model, self.data)\n # Save local model as historical model (previous round)\n # It will be compared the next round during training local model (constrantive loss)\n # When aggregation in global model (first) and aggregation with similarities and weights (second), the historical model keeps inmutable\n logging.info(\"Saving historical model...\")\n self.model.save_historical_model()\n except Exception as e:\n logging.exception(f\"Error training model: {e}\")\n logging.exception(traceback.format_exc())\n\n def test(self):\n try:\n self.create_trainer()\n self.__trainer.test(self.model, self.data, verbose=True)\n except Exception as e:\n logging.exception(f\"Error testing model: {e}\")\n logging.exception(traceback.format_exc())\n\n def get_model_weight(self):\n return (\n len(self.data.train_dataloader().dataset),\n len(self.data.test_dataloader().dataset),\n )\n\n def finalize_round(self):\n self.logger.global_step = self.logger.global_step + self.logger.local_step\n self.logger.local_step = 0\n self.round += 1\n self.logger.log_data({\"Round\": self.round}, step=self.logger.global_step)\n pass\n
def get_hash_model(self):\n \"\"\"\n Returns:\n str: SHA256 hash of model parameters\n \"\"\"\n # serialize_model requires the model as an argument\n return hashlib.sha256(self.serialize_model(self.model)).hexdigest()\n
nebula/frontend/app.py
def attack_node_assign(\n nodes,\n federation,\n attack,\n poisoned_node_percent,\n poisoned_sample_percent,\n poisoned_noise_percent,\n):\n \"\"\"Identify which nodes will be attacked\"\"\"\n import math\n import random\n\n attack_matrix = []\n n_nodes = len(nodes)\n if n_nodes == 0:\n # Return the same (nodes, attack_matrix) pair as the normal path\n return nodes, attack_matrix\n\n nodes_index = []\n # Get the nodes index\n if federation == \"DFL\":\n nodes_index = list(nodes.keys())\n else:\n for node in nodes:\n if nodes[node][\"role\"] != \"server\":\n nodes_index.append(node)\n\n n_nodes = len(nodes_index)\n # Number of attacked nodes, round up\n num_attacked = int(math.ceil(poisoned_node_percent / 100 * n_nodes))\n if num_attacked > n_nodes:\n num_attacked = n_nodes\n\n # Get the index of attacked nodes\n attacked_nodes = random.sample(nodes_index, num_attacked)\n\n # Assign the role of each node\n for node in nodes:\n node_att = \"No Attack\"\n attack_sample_percent = 0\n poisoned_ratio = 0\n if (node in attacked_nodes) or (nodes[node][\"malicious\"]):\n node_att = attack\n attack_sample_percent = poisoned_sample_percent / 100\n poisoned_ratio = poisoned_noise_percent / 100\n nodes[node][\"attacks\"] = node_att\n nodes[node][\"poisoned_sample_percent\"] = attack_sample_percent\n nodes[node][\"poisoned_ratio\"] = poisoned_ratio\n attack_matrix.append([node, node_att, attack_sample_percent, poisoned_ratio])\n return nodes, attack_matrix\n
def mobility_assign(nodes, mobile_participants_percent):\n \"\"\"Assign mobility to nodes\"\"\"\n import math\n import random\n\n # Number of mobile nodes, round down\n num_mobile = math.floor(mobile_participants_percent / 100 * len(nodes))\n if num_mobile > len(nodes):\n num_mobile = len(nodes)\n\n # Get the index of mobile nodes\n mobile_nodes = random.sample(list(nodes.keys()), num_mobile)\n\n # Assign the mobility flag of each node\n for node in nodes:\n node_mob = False\n if node in mobile_nodes:\n node_mob = True\n nodes[node][\"mobility\"] = node_mob\n return nodes\n
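The two assignment functions round in opposite directions: the number of attacked nodes is rounded up, while the number of mobile nodes is rounded down. The sketch below isolates just that arithmetic. `count_attacked` and `count_mobile` are hypothetical helper names, not functions from `nebula/frontend/app.py`.

```python
import math


def count_attacked(n_nodes, poisoned_node_percent):
    """Number of attacked nodes: percentage of n_nodes, rounded up and capped."""
    return min(n_nodes, math.ceil(poisoned_node_percent / 100 * n_nodes))


def count_mobile(n_nodes, mobile_participants_percent):
    """Number of mobile nodes: percentage of n_nodes, rounded down and capped."""
    return min(n_nodes, math.floor(mobile_participants_percent / 100 * n_nodes))


# 25% of 10 nodes is 2.5, so the two functions disagree by one node.
attacked = count_attacked(10, 25)
mobile = count_mobile(10, 25)
print(attacked, mobile)
```

In practice this means any non-zero attack percentage selects at least one node, whereas a small mobility percentage on a small federation can select none.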
NEBULA: A Platform for Decentralized Federated Learning
federatedlearning.inf.um.es
NEBULA is developed by Enrique Tomás Martínez Beltrán in collaboration with the University of Murcia, armasuisse, and the University of Zurich.
@article{MartinezBeltran:DFL:2023,\n title = {{Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges}},\n author = {Mart{\\'i}nez Beltr{\\'a}n, Enrique Tom{\\'a}s and Quiles P{\\'e}rez, Mario and S{\\'a}nchez S{\\'a}nchez, Pedro Miguel and L{\\'o}pez Bernal, Sergio and Bovet, G{\\'e}r{\\^o}me and Gil P{\\'e}rez, Manuel and Mart{\\'i}nez P{\\'e}rez, Gregorio and Huertas Celdr{\\'a}n, Alberto},\n year = 2023,\n volume = {25},\n number = {4},\n pages = {2983-3013},\n journal = {IEEE Communications Surveys & Tutorials},\n doi = {10.1109/COMST.2023.3315746},\n preprint = {https://arxiv.org/abs/2211.08413}\n}\n
@article{MartinezBeltran:fedstellar:2024,
  title = {{Fedstellar: A Platform for Decentralized Federated Learning}},
  author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and Perales G{\'o}mez, {\'A}ngel Luis and Feng, Chao and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto},
  year = 2024,
  volume = {242},
  issn = {0957-4174},
  pages = {122861},
  journal = {Expert Systems with Applications},
  doi = {10.1016/j.eswa.2023.122861},
  preprint = {https://arxiv.org/abs/2306.09750}
}
@inproceedings{MartinezBeltran:fedstellar_demo:2023,
  title = {{Fedstellar: A Platform for Training Models in a Privacy-preserving and Decentralized Fashion}},
  author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto},
  year = 2023,
  month = aug,
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  pages = {7154--7157},
  doi = {10.24963/ijcai.2023/838},
  note = {Demo Track},
  editor = {Edith Elkind}
}
@article{MartinezBeltran:DFL_mitigating_threats:2023,
  title = {{Mitigating Communications Threats in Decentralized Federated Learning through Moving Target Defense}},
  author = {Mart{\'i}nez Beltr{\'a}n, Enrique Tom{\'a}s and S{\'a}nchez S{\'a}nchez, Pedro Miguel and L{\'o}pez Bernal, Sergio and Bovet, G{\'e}r{\^o}me and Gil P{\'e}rez, Manuel and Mart{\'i}nez P{\'e}rez, Gregorio and Huertas Celdr{\'a}n, Alberto},
  year = 2024,
  journal = {Wireless Networks},
  doi = {10.1007/s11276-024-03667-8},
  preprint = {https://arxiv.org/abs/2307.11730}
}
Fedstellar was our first version of the platform. We have redesigned the previous functionalities and added new capabilities based on our research. The platform is now called NEBULA and is available as an open-source project.