Network architecture

Choosing a network card

Depending on your needs, the network card may have to support one or more of the following features:

  • PTP

  • PCI-passthrough

  • SR-IOV

None of these features is required by SEAPATH itself. However, PTP is required by the IEC 61850 standard.

If your machine doesn't have enough interfaces, you can use a USB-Ethernet adapter. For example, the TRENDnet TU3-ETG was tested and works natively on SEAPATH. Interfaces added this way are not compatible with any of the features described above.

SEAPATH network on one hypervisor

The following description fits the case of a standalone hypervisor, but is also applicable to a machine in the cluster.

Physical interfaces

There are four different types of interfaces to consider on a SEAPATH hypervisor:

  • Administration: This interface is used to log in as the administration user over SSH and to configure the hypervisor. It is also used by Ansible.

  • VM administration: This interface has the same role as the previous one, but for the virtual machines.

  • PTP interface: This interface is used to transmit PTP frames and synchronize the time on the machine.

  • IEC 61850 interface: This interface receives the sampled values (SV) and GOOSE messages.

VM interfaces

There are three different ways to transmit data from a physical network interface to a virtual machine:

  • virtio: A virtual NIC (Network Interface Card) is created inside the virtual machine. Data received on the hypervisor is processed by the physical network card and then passed to the VM's virtual card. This is the simplest method, but also the slowest.

  • PCI passthrough: Control of the entire physical network card is given to the VM. Received data is then processed directly by the virtual machine and does not go through the hypervisor. This method is much faster, but has two drawbacks:

    • The entire physical NIC is given to one VM. Other VMs or the hypervisor can’t use it anymore

    • It requires a compatible NIC

  • SR-IOV: Single Root I/O Virtualization is a PCI Express extended capability that makes one physical device appear as multiple virtual devices. Multiple VMs can then share the same physical NIC. This is the fastest method, but it requires a compatible NIC.

Recommendations

Many network configurations are possible using SEAPATH. Here are our recommendations:

Administration

To avoid using too many interfaces, the administration of all VMs can use the same physical NIC. Hypervisor administration can also be done on this interface.
This is possible by using a Linux or OVS bridge and connecting all VMs and the hypervisor to it.

This behavior is provided by the br0 bridge, which is preconfigured by Ansible (see the Ansible inventory examples).

IEC 61850 traffic

Sampled values and GOOSE messages should be received on an interface using PCI-passthrough or SR-IOV, because the classic virtio interface is not fast enough.
In this situation, each VM receiving IEC 61850 data must have its own dedicated interface.

For testing purposes, to avoid using one interface per VM, these data can be received on classic interfaces using virtio. This is done by using a Linux or OVS bridge connected both to the physical NIC and to all the virtual machines.

See the variables `bridges` and `ovs_bridges` in Ansible inventories examples.

PTP

PTP synchronization does not generate much traffic. The PTP synchronization of the hypervisor can thus run either:

  • Alone on a specific interface

  • On the administration interface

Please remember that PTP requires specific NIC support.

In order to synchronize the virtual machines, the PTP clock of the host can be used (ptp_kvm). It is also possible to pass the PTP clock on the PCI-passthrough/SR-IOV interface given to the VM.

See the Time synchronization page for more details.

Example of configuration

Below is an example of a SEAPATH machine with two VMs. VM1 receives SV/GOOSE and must thus be synchronized with PTP and use PCI-passthrough. VM2 is just for monitoring, and does not require either PTP or PCI-passthrough.

  • One interface is used for both VM and hypervisor administration

  • One interface to synchronize the host and VM1 with PTP

  • One interface for PCI-passthrough in VM1 handling IEC 61850

Connecting machines in a cluster

A SEAPATH cluster is used to ensure redundancy of the machines (if one hypervisor fails) and of the network (if one link fails).

In order to connect all the machines together, two options are available:

  • Connecting all machines to two switches (the second is needed to ensure redundancy if the first switch fails)

  • Connecting all machines in a ring architecture

The second method is recommended on SEAPATH because it doesn’t require an additional device (the switch). In that case, every machine is connected to its neighbors, forming a triangle. If one link breaks, the packets still have another route to reach the targeted machine. See GitHub for more information.
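The redundancy argument for the ring can be checked with a small sketch. It models the machines as an undirected graph and verifies that every machine remains reachable whichever single link is removed. The host names and function names are hypothetical and only serve the illustration.

```python
from collections import deque


def reachable(links, start):
    """Return the set of hosts reachable from `start` over undirected links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for nxt in neighbors.get(host, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def survives_single_link_failure(hosts, links):
    """True if all hosts stay connected whichever single link is removed."""
    start = sorted(hosts)[0]
    for broken in links:
        remaining = [link for link in links if link != broken]
        if reachable(remaining, start) != set(hosts):
            return False
    return True


# Hypothetical host names: two hypervisors and one observer.
HOSTS = {"hypervisor1", "hypervisor2", "observer"}
# Triangle: every machine is cabled to its two neighbors.
RING = [
    ("hypervisor1", "hypervisor2"),
    ("hypervisor2", "observer"),
    ("observer", "hypervisor1"),
]
# Same machines with one cable missing (a simple chain).
CHAIN = RING[:2]

if __name__ == "__main__":
    print("ring tolerates one broken link:", survives_single_link_failure(HOSTS, RING))
    print("chain tolerates one broken link:", survives_single_link_failure(HOSTS, CHAIN))
```

The check only covers connectivity between machines. It says nothing about how the traffic is actually rerouted.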

In that case, two more interfaces per machine are needed, which leads to a minimum of four interfaces per machine (two for the cluster, one for the administration, and one for IEC 61850 traffic).

Observer 

An observer machine doesn’t run any virtual machines. It is only used in the cluster to arbitrate if a problem occurs between the two hypervisors.

On this machine, no IEC 61850 packets will be received.
It can be synchronized with PTP, but this is not needed; simple NTP synchronization is enough.

Even if PTP is not required, time synchronization must be ensured, at least with NTP. Otherwise, the machines will not be able to form the SEAPATH cluster.

So, on an observer, only three interfaces are needed: two for the cluster and one for administration.
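The interface counts given for a ring cluster can be summarized as a short sketch. The function name and role labels are hypothetical. It only encodes the minimums stated above and ignores extra interfaces such as one PCI-passthrough NIC per VM.

```python
def minimum_interfaces(role):
    """Minimum number of physical interfaces for a machine in a ring cluster.

    `role` is either "hypervisor" or "observer" (hypothetical labels).
    """
    if role not in ("hypervisor", "observer"):
        raise ValueError(f"unknown role: {role!r}")
    count = 2  # cluster ring: one link to each neighbor
    count += 1  # administration
    if role == "hypervisor":
        count += 1  # IEC 61850 traffic (an observer receives none)
    return count


if __name__ == "__main__":
    for role in ("hypervisor", "observer"):
        print(role, minimum_interfaces(role))
```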

Administration machine

In order to configure all the machines in the cluster, we advise using a switch connected to the administration machine.

Below is an example with two hypervisors and one observer, all connected together in a triangle cluster. The administration machine is connected to them using a switch.

Receiving Sample values and PTP frames

Machines

The standard way to generate sampled values is to use a merging unit. For PTP, it is a grandmaster clock.

However, in a test environment, these two machines can be simulated.

In the example for this section, a single machine simulates both PTP frames and IEC 61850 traffic. This machine is named “Publisher”.

Connections

To connect the publisher (or the merging unit/grandmaster clock), a direct connection with a single cable is best. It avoids unnecessary latency, but requires a machine with many interfaces.

Alternatively, a switch can be used between the publisher and the cluster machines. In this situation, we advise you to use a separate switch from the one used for administration. It is possible to use only one switch for both, but in that case, the two networks must be separated using VLANs.

Note: To manage PTP, it is best to use a PTP-compliant switch. See the Time synchronization page for more information.

Below is a schema of a cluster with two hypervisors and an observer. The administration network is isolated on its own switch. Another switch is used for PTP and IEC 61850 traffic. Remember that if you use PCI-passthrough, you will need as many interfaces as you have virtual machines.

To prevent PTP frames from interfering with IEC 61850 traffic, it is best to isolate PTP frames on a VLAN. This is done with the variable `ptp_vlanid` in the Ansible inventories.

Reconfiguring the network entirely with netplan

It is possible to redefine the SEAPATH network entirely using netplan.io YAML files. This is done with the variable netplan_configurations in the inventory. Examples are provided in the Ansible repository.

Below are some useful examples for using netplan:

For now, using netplan disables all of SEAPATH's own network configuration. Everything, including the administration and cluster networks, has to be handled by netplan YAML files.

The goal is to be able to define the network entirely, either with systemd-networkd directly or with netplan, but this is not implemented yet.

OpenVSwitch management on a standalone SEAPATH configuration

Creating an OVS bridge with Netplan

SEAPATH supports the creation of an OVS bridge directly from a Netplan configuration file. To do so, you can use the example Netplan configuration file located in the SEAPATH Ansible inventories directory.

In the following example, we create an OVS bridge connected to the physical interface designated by the ovs_ext_interface inventory variable, defined in the hypervisor inventory. All traffic coming from this interface is redirected to this bridge. This configuration is useful when you want to share the traffic arriving on one interface between multiple virtual ports, and therefore multiple virtual machines.

# Copyright (C) 2024 Savoir-faire Linux, Inc.
# SPDX-License-Identifier: Apache-2.0
network:
  version: 2
  renderer: networkd
  openvswitch:
    protocols: [OpenFlow13, OpenFlow14, OpenFlow15]
  ethernets:
    {{ ovs_ext_interface }}: {}
  bridges:
    ovs0:
      interfaces: [{{ ovs_ext_interface }}]
      openvswitch:
        protocols: [OpenFlow10, OpenFlow11, OpenFlow12]
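To see what the file looks like once the ovs_ext_interface placeholder is filled in, the substitution can be sketched in Python. This is only an illustration of the result: it uses a plain string replacement rather than the real templating engine, and the render_netplan function name is hypothetical. The interface name enp3s0 is the one that appears in the command output of this section.

```python
# Same content as the Netplan example above, with its YAML nesting.
NETPLAN_TEMPLATE = """\
network:
  version: 2
  renderer: networkd
  openvswitch:
    protocols: [OpenFlow13, OpenFlow14, OpenFlow15]
  ethernets:
    {{ ovs_ext_interface }}: {}
  bridges:
    ovs0:
      interfaces: [{{ ovs_ext_interface }}]
      openvswitch:
        protocols: [OpenFlow10, OpenFlow11, OpenFlow12]
"""


def render_netplan(ovs_ext_interface):
    """Fill in the interface placeholder with a plain string replacement."""
    return NETPLAN_TEMPLATE.replace("{{ ovs_ext_interface }}", ovs_ext_interface)


if __name__ == "__main__":
    print(render_netplan("enp3s0"))
```

The nesting matters here: version, renderer, openvswitch, ethernets, and bridges are all children of network, and the bridge-level openvswitch block belongs to ovs0.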

After applying the Netplan configuration with the netplan apply command, you can identify the created OVS bridge (here ovs0) with the ip a command:

root@minisforum:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 58:47:ca:72:49:50 brd ff:ff:ff:ff:ff:ff
3: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 58:47:ca:72:49:51 brd ff:ff:ff:ff:ff:ff
4: enx782d7e14367c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
    link/ether 78:2d:7e:14:36:7c brd ff:ff:ff:ff:ff:ff
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:f7:2f:2c:1b:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.216.74/24 brd 192.168.216.255 scope global br0
       valid_lft forever preferred_lft forever
8: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 38:d5:7a:25:d4:2d brd ff:ff:ff:ff:ff:ff
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:a2:71:7b:7e brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
10: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:86:fb:0e:ff:7c brd ff:ff:ff:ff:ff:ff
11: ovs0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 58:47:ca:72:49:51 brd ff:ff:ff:ff:ff:ff

You can also confirm the creation of the OVS bridge using ovs-vsctl show:

root@minisforum:~# ovs-vsctl show
db96c66c-2cfb-40d0-bbcb-10038eb1e6fb
    Bridge ovs0
        fail_mode: standalone
        Port enp3s0
            Interface enp3s0
        Port ovs0
            Interface ovs0
                type: internal
    ovs_version: "3.1.0"

Here, enp3s0 is the interface designated by ovs_ext_interface.

Creating an OVS virtual port with Libvirt

SEAPATH also supports the creation of an OVS virtual port directly from the VM Libvirt XML configuration file. To do so, you can reuse the templated VM file guest.xml.j2, which supports this feature.

{% if vm.bridges is defined %}
{% for bridge in vm.bridges %}
    <interface type="bridge">
      <source bridge="{{ bridge.name }}"/>
      <mac address="{{ bridge.mac_address }}"/>
      <model type="virtio"/>
{% if bridge.type is defined %}
      <virtualport type='{{ bridge.type }}'/>
{% endif %}
{% if bridge.vlan is defined %}
      <vlan>
        <tag id='{{ bridge.vlan.vlan_tag }}'/>
      </vlan>
{% endif %}
    </interface>
{% endfor %}
{% endif %}

Then, in your VM inventory, add the OVS bridge to the bridges field, specifying the name and the MAC address, and set the type to openvswitch. This type tells Libvirt that the selected bridge is an OpenVSwitch bridge, so Libvirt adds a virtual port connected to it. Based on the previous example, all traffic coming from the interface designated by ovs_ext_interface is redirected to this new virtual port.

bridges:
  - name: "ovs0"
    mac_address: "58:47:ca:72:49:51"
    type: openvswitch

After the VM is created and started, this new virtual port (here vnet0) appears on the hypervisor side in the output of the ip a command:

root@minisforum:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 58:47:ca:72:49:50 brd ff:ff:ff:ff:ff:ff
3: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 58:47:ca:72:49:51 brd ff:ff:ff:ff:ff:ff
4: enx782d7e14367c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
    link/ether 78:2d:7e:14:36:7c brd ff:ff:ff:ff:ff:ff
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:f7:2f:2c:1b:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.216.74/24 brd 192.168.216.255 scope global br0
       valid_lft forever preferred_lft forever
8: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 38:d5:7a:25:d4:2d brd ff:ff:ff:ff:ff:ff
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:a2:71:7b:7e brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
10: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:86:fb:0e:ff:7c brd ff:ff:ff:ff:ff:ff
11: ovs0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 58:47:ca:72:49:51 brd ff:ff:ff:ff:ff:ff
12: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UNKNOWN group default qlen 1000
    link/ether fe:47:ca:72:49:51 brd ff:ff:ff:ff:ff:ff

You may need to isolate the traffic of a specific virtual port by placing it on its own VLAN. To do so, add a vlan field to the bridge entry in the VM inventory, as follows:

bridges:
  - name: "ovs0" # Change the bridge name
    mac_address: "58:47:ca:72:49:51"
    type: openvswitch
    vlan:
      vlan_tag: 100 # Read by guest.xml.j2 as bridge.vlan.vlan_tag

In this example, the vnet0 OVS virtual port will be isolated on VLAN 100.
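To make the mapping from an inventory bridge entry to the Libvirt XML easier to follow, here is a minimal Python sketch that mirrors the logic of the guest.xml.j2 excerpt shown earlier. It is not the code SEAPATH runs (the real rendering is done by the template). The render_interface function is hypothetical, and the VLAN tag is read from vlan.vlan_tag, as the template does.

```python
import xml.etree.ElementTree as ET


def render_interface(bridge):
    """Build the <interface> element for one entry of the VM bridges list."""
    iface = ET.Element("interface", {"type": "bridge"})
    ET.SubElement(iface, "source", {"bridge": bridge["name"]})
    ET.SubElement(iface, "mac", {"address": bridge["mac_address"]})
    ET.SubElement(iface, "model", {"type": "virtio"})
    # Optional: declares the bridge as an OVS bridge (type: openvswitch).
    if "type" in bridge:
        ET.SubElement(iface, "virtualport", {"type": bridge["type"]})
    # Optional: tags the virtual port with a VLAN ID.
    if "vlan" in bridge:
        vlan = ET.SubElement(iface, "vlan")
        ET.SubElement(vlan, "tag", {"id": str(bridge["vlan"]["vlan_tag"])})
    return iface


# Values taken from the inventory examples of this section.
EXAMPLE_BRIDGE = {
    "name": "ovs0",
    "mac_address": "58:47:ca:72:49:51",
    "type": "openvswitch",
    "vlan": {"vlan_tag": 100},
}

if __name__ == "__main__":
    print(ET.tostring(render_interface(EXAMPLE_BRIDGE), encoding="unicode"))
```

Both optional elements are omitted when the corresponding key is absent, so a plain Linux bridge entry with only name and mac_address produces an interface without virtualport or vlan.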

SEAPATH cluster network

The default recommended SEAPATH architecture uses 3 hosts, directly interconnected in a loop topology (the "cluster network"). This network is used for:

  • Ceph clustering communication (public and private network)
  • Corosync clustering communication (used by Pacemaker)
  • VXLAN traffic to extend the VM bridges topology

It would also be possible to use external switches for this cluster network, but the high availability requirements would demand 2 switches (in case one fails), interconnected twice (in case one link fails), with every host connected to both switches and a mechanism like LACP to deal with multiple paths.

We think that, for a 3-node cluster, a no-switch architecture is simpler to create, manage, and debug.


Bridge architecture

We propose a layer 2 networking solution (using OpenVSwitch, since it is included in the SEAPATH distribution), where each host runs a software bridge (we call it "team0") with 2 ports connected to external network interfaces. Each of the 2 network interfaces is connected to one of the other hosts. The IP address of the host on this cluster network is set directly on the team0 bridge.



Loop prevention, RSTP

This creates an extended L2 network across the 3 nodes, each node having an L3 IP address for all the communications mentioned above. However, it also creates an obvious loop. L2 loops need to be dealt with using some form of spanning tree protocol. We propose to use RSTP. In this example we set a smaller RSTP priority on host1, which will make RSTP cut the loop between the


Cabling


To simplify the cabling engineering, we recommend:
  • thinking of the 3 hosts as "1, 2, 3", following each other in the loop "1 --> 2 --> 3 --> 1"
    • if N=1 then N+1=2
    • if N=2 then N+1=3
    • if N=3 then N+1=1
  • calling the 2 team0 bridge physical interfaces "team0_0" and "team0_1"
  • on host N, team0_0 should be connected to team0_1 of host N+1
  • consequently on host N, team0_1 is connected to team0_0 of host N-1
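The cabling rule above can be written as a short sketch that lists every cable of the loop. The cabling function and its tuple layout are hypothetical. Only the team0_0 and team0_1 names and the "N to N+1" rule come from the recommendations.

```python
def cabling(num_hosts=3):
    """Return the cables of the loop as ((host, port), (host, port)) pairs.

    On host N, team0_0 is connected to team0_1 of host N+1,
    where the host after the last one is host 1.
    """
    cables = []
    for n in range(1, num_hosts + 1):
        next_host = n % num_hosts + 1  # N+1, wrapping back to host 1
        cables.append(((n, "team0_0"), (next_host, "team0_1")))
    return cables


if __name__ == "__main__":
    for (src_host, src_port), (dst_host, dst_port) in cabling():
        print(f"host{src_host} {src_port} <--> host{dst_host} {dst_port}")
```

Each physical port appears exactly once in the list, which is a quick way to check a cabling plan before plugging anything in.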