EP-3738353-B1 - METHODS AND APPARATUS FOR ROAMING BETWEEN WIRELESS COMMUNICATIONS NETWORKS


Inventors

KVERNVIK, TOR
ISAKSSON, MARTIN
OLSSON, HJALMAR

Dates

Publication Date: 20260506
Application Date: 20180112

Claims (9)

1. A method (200) performed by a first wireless device, the first wireless device being served by a first wireless access point in a first wireless communications network, the first wireless communications network being operated by a first network operator, the method comprising: acquiring (202) a determination from a first reinforcement learning agent of whether to roam from the first wireless access point to a second wireless access point in a second wireless communications network, the second wireless communications network being operated by a second network operator, wherein the first reinforcement learning agent shares a reward function with a second reinforcement learning agent, the first reinforcement learning agent being comprised in the first wireless device and the second reinforcement learning agent being associated with a second wireless device, and wherein the first wireless device and the second wireless device form part of a group of wireless devices, and the shared reward function stored in a node is shared between the wireless devices in the group of wireless devices; and roaming (204) from the first wireless access point to the second wireless access point, based on the determination.
2. A method as in claim 1 wherein the devices in the group of wireless devices have at least one common connection parameter.
3. A method as in claim 1 or 2, wherein the first reinforcement learning agent receives a parameter indicative of a positive reward if: the second wireless access point is in a home network associated with the first wireless device; and/or roaming to the second wireless access point from the first wireless access point improves connectivity of the first wireless device.
4. A method as in any one of claims 1 to 3 wherein the first reinforcement learning agent receives a parameter indicative of a negative reward when: the first wireless device roams to the second wireless access point in the second network; roaming to the second wireless access point decreases the connectivity of the first wireless device; roaming leads to a loss of connectivity of the first wireless device; and/or when an inter-network operator handover procedure is performed.
5. A method performed by a node of a wireless communications network, the method comprising: allocating (802) a parameter indicative of a reward according to a first reward function to a first reinforcement learning agent being comprised in a first wireless device based on an action determined by the first reinforcement learning agent, the action comprising providing an instruction to the first wireless device served by a first wireless access point in a first wireless communications network operated by a first network operator, the instruction instructing the first wireless device to roam from the first wireless access point to a second wireless access point in a second wireless communications network operated by a second network operator, wherein the first wireless device is part of a first group of wireless devices and wherein the method further comprising: allocating a parameter indicative of a reward to another wireless device in the first group of wireless devices using the first reward function.
6. A method as in claim 5 wherein the wireless devices in the first group of wireless devices have at least one common connection parameter.
7. A method as in any one of claims 5 to 6 further comprising: allocating a parameter indicative of a reward to a third reinforcement learning agent based on an action determined by the third reinforcement learning agent for a third wireless device, wherein the third wireless device is part of a second group of wireless devices, and allocating a parameter indicative of a reward using a second reward function, the second reward function being different to the first reward function.
8. A first wireless device (100), the first wireless device being connected to a first wireless access point in a first wireless communications network, the first wireless communications network being operated by a first network operator, the first wireless device comprising a processor (102) and a memory (104), said memory containing instructions executable by said processor whereby said first wireless device is operative to: acquire a determination from a first reinforcement learning agent (106) of whether to roam from the first wireless access point to a second wireless access point in a second wireless communications network, the second wireless communications network being operated by a second network operator, wherein the first reinforcement learning agent (106) shares a reward function with a second reinforcement learning agent, the first reinforcement learning agent being comprised in the first wireless device and the second reinforcement learning agent being associated with a second wireless device, and wherein the first wireless device and the second wireless device form part of a group of wireless devices, and the shared reward function stored in a node is shared between the wireless devices in the group of wireless devices; and roam from the first wireless access point to the second wireless access point, based on the determination.
9. A node (700) in a wireless communications network, the node comprising a processor (702) and a memory (704), said memory containing instructions executable by said processor whereby said node is operative to: allocate a parameter indicative of a reward according to a first reward function to a first reinforcement learning agent (706) being comprised in a first wireless device based on an action determined by the first reinforcement learning agent, the action comprising providing an instruction to the first wireless device served by a first wireless access point in a first wireless communications network operated by a first network operator, the instruction instructing the first wireless device to roam from the first wireless access point to a second wireless access point in a second wireless communications network operated by a second network operator, wherein the first wireless device is part of a first group of wireless devices and wherein the node (700) is further operative to: allocate a parameter indicative of a reward to another wireless device in the first group of wireless devices using the first reward function.
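
As an informal illustration only, and not part of the claims, the reward conditions of claims 3 and 4 and the node-side allocation of claims 5 and 9 might be sketched as follows. All names (RoamOutcome, group_reward, RewardNode), the connectivity metric and the numeric weights are hypothetical assumptions; the claims do not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class RoamOutcome:
    """Observed result of one roaming decision (field names are hypothetical)."""
    roamed: bool                   # device moved to the second access point
    target_is_home_network: bool   # second access point is in the home network
    connectivity_delta: float      # change in some connectivity metric; > 0 is better
    lost_connectivity: bool        # roaming led to a loss of connectivity
    inter_operator_handover: bool  # an inter-network-operator handover was performed


# Illustrative weights only; the claims do not specify magnitudes.
HOME_BONUS, IMPROVE_BONUS = 1.0, 1.0
ROAM_COST, DEGRADE_PENALTY, LOSS_PENALTY, HANDOVER_COST = 0.25, 1.0, 2.0, 0.5


def group_reward(outcome: RoamOutcome) -> float:
    """One possible reward function covering the conditions of claims 3 and 4."""
    if not outcome.roamed:
        return 0.0
    reward = -ROAM_COST  # assumed cost for roaming to the second network
    if outcome.target_is_home_network:
        reward += HOME_BONUS
    if outcome.connectivity_delta > 0:
        reward += IMPROVE_BONUS
    elif outcome.connectivity_delta < 0:
        reward -= DEGRADE_PENALTY
    if outcome.lost_connectivity:
        reward -= LOSS_PENALTY
    if outcome.inter_operator_handover:
        reward -= HANDOVER_COST
    return reward


RewardFn = Callable[[RoamOutcome], float]


class RewardNode:
    """Node that stores one reward function per group of wireless devices."""

    def __init__(self) -> None:
        self._reward_fn: Dict[str, RewardFn] = {}
        self._group_of: Dict[str, str] = {}

    def register_group(self, group_id: str, reward_fn: RewardFn,
                       device_ids: Iterable[str]) -> None:
        self._reward_fn[group_id] = reward_fn
        for device_id in device_ids:
            self._group_of[device_id] = group_id

    def allocate(self, device_id: str, outcome: RoamOutcome) -> float:
        """Return the reward for a device using its group's shared function."""
        return self._reward_fn[self._group_of[device_id]](outcome)


node = RewardNode()
node.register_group("group-1", group_reward, ["device-1", "device-2"])
outcome = RoamOutcome(True, False, 3.0, False, True)
print(node.allocate("device-1", outcome))  # 0.25
```

Because both devices are registered under the same group, the node applies the same function to each of them. A second group, as in claim 7, could be registered with a different function through the same hypothetical register_group call.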

Description

TECHNICAL FIELD

Embodiments herein relate to methods and devices in a wireless communications network. More particularly but non-exclusively, embodiments herein relate to the use of reinforcement learning agents when roaming between wireless communications networks.

BACKGROUND

This disclosure generally relates to roaming in wireless communications networks. Connectivity is crucial for many mobile devices yet many geographical areas have limited or even no connectivity (e.g. connectivity black spots or "black holes"). Internet of Things (IoT) devices may require reliable connectivity, possibly all the time.

Generally, connectivity may vary between different wireless communications networks (that may be run by different operators) and geographical areas. For example, a wireless communications network run by operator A may provide better connectivity at location A than a wireless communications network run by operator B or vice versa.

Moving wireless devices present an additional challenge. In order to maximize the connectivity for moving users there may be a need to roam (e.g. transfer service) between wireless communications networks. There may be a cost associated with roaming. For example, this may comprise a temporary reduction of connectivity as service is transferred, or a monetary cost associated with transferring service to the new wireless communications network. There may also be a need to roam when, for example, a stationary device is impacted by traffic load, weather conditions or configuration changes in an operator's network.

Currently, few markets support national roaming. For example, wireless devices may be provided with modems having multiple SIM cards, each SIM card being associated with a different wireless communication network. Roaming between operators may typically be performed by vendor (e.g. operator) specific methods. For example, roaming may be controlled manually by the user or controlled by software in the modems.
Such software solutions may be vendor-specific and based on hardcoded criteria (e.g. "switch between operators when the connectivity is beyond a threshold"). Roaming between operators may also be device-vendor specific, for example, a modem may comprise more than one SIM card. In such a scenario, the modem may determine when to switch between subscriptions (e.g. using vendor-specific methods) without any input from a network operator.

When a decision is taken to roam between two wireless communications networks, there are several methods to enforce the roaming, as follows.

eSIM roaming relies on an eSIM (also known as a soft SIM or embedded universal integrated-circuit card, eUICC), a secure element designed to remotely manage multiple mobile network operator subscriptions and to comply with GSM Association (GSMA) specifications. This makes it possible to change wireless communications network (e.g. network operator) remotely, e.g. moving from one operator to another without changing the SIM card. The wireless communications network profile is changed on the device without being recognized by the device. The functionality of an embedded SIM may be the same as a traditional SIM, but a provisioning profile is assigned during manufacturing that allows a Subscription Manager to download and manage 'operational profiles' on the eUICC. For example, the subscription manager may manage profiles PF-A and PF-B, e.g. profiles for wireless communications networks run by operators A and B respectively.

In national roaming, a wireless device roams between operators in the same country. The subscriber has a Home Public Land Mobile Network (HPLMN) but can roam to other wireless communications networks, described as Visited Public Land Mobile Networks (VPLMNs). In this way, a wireless device may temporarily switch between different mobile networks using a single subscription. Charging and authentication are handled by the HPLMN.
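
For illustration, a hardcoded criterion of the kind quoted earlier in this section ("switch between operators when the connectivity is beyond a threshold") might look like the sketch below, applied to the profiles PF-A and PF-B. The function name select_profile, the use of signal strength in dBm as the connectivity measure and the threshold value are assumptions made for this example only; the source does not describe any particular vendor's logic.

```python
from typing import Dict

SWITCH_THRESHOLD_DBM = -100.0  # hypothetical fixed threshold


def select_profile(active: str, signal_dbm: Dict[str, float],
                   threshold_dbm: float = SWITCH_THRESHOLD_DBM) -> str:
    """Keep the active profile unless its signal is below the threshold."""
    if signal_dbm[active] >= threshold_dbm:
        return active
    # Below threshold: pick the profile with the strongest measured signal.
    return max(signal_dbm, key=signal_dbm.get)


print(select_profile("PF-A", {"PF-A": -110.0, "PF-B": -85.0}))  # PF-B
print(select_profile("PF-A", {"PF-A": -95.0, "PF-B": -60.0}))   # PF-A
```

In the second call the device stays on PF-A although PF-B reports a stronger signal, because this rule looks only at the fixed threshold for the active profile.
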
In multiple modems roaming, a wireless device (such as a machine to machine (M2M) device) has multiple modems that can be switched between when the user wants to change wireless communications network. The switch may be controlled by a subscription manager. In multiple modems roaming, separate modems and subscriptions are required.

WO 2012/073059 A1 discloses a method for enabling usage of different resources of different network operators in different duplex directions. A UE may be optimized to determine which network operator and/or which channel or frequency of a network operator to utilize for connectivity by reinforcement learning.

US 2012/108206 A1 discloses a method that allows a device to migrate wireless service across multiple wireless networks.

WO 2015/157933 A1 discloses a method for dynamic VSIM provisioning on a multi-SIM wireless device having a first SIM as a Universal Integrated Circuit Card (UICC) and a virtual SIM (VSIM) stored on an embedded UICC (eUICC).

SUMMARY

As described above, maintaining connectivity of