CN-121993892-A - Air conditioner control method, device, air conditioner, storage medium and program product


Abstract

The invention provides a control method and device for an air conditioner, an air conditioner, a storage medium and a computer program product. The method comprises: obtaining environmental state information, which comprises a temperature value, a humidity value and a door and window state value of the current environment of the air conditioner, and a water accumulation depth value of the condensed water generated by the air conditioner; inputting the environmental state information into a pre-trained condensate water dynamic regulation strategy network, so that the network determines a target working state of the air conditioner according to the environmental state information, the target working state being used to keep the water accumulation depth of the condensed water within a preset water accumulation depth threshold and to keep the temperature change of the current environment within a preset range; and controlling the air conditioner to work according to the target working state. Because the target working state is determined from the environmental state information, the accumulated water depth of the condensed water can be kept below a safety threshold while temperature fluctuation stays within the preset range, which preserves user comfort.

Inventors

XIE FANGXIN
LIU JIE
PANG WEI
CHENG YU
HUANG ZHIXIN

Assignees

珠海格力电器股份有限公司 (Gree Electric Appliances, Inc. of Zhuhai)

Dates

Publication Date: 20260508
Application Date: 20251222

Claims (12)

1. A control method of an air conditioner, the method comprising: acquiring environmental state information, wherein the environmental state information comprises a temperature value, a humidity value and a door and window state value of the current environment of the air conditioner, and a water accumulation depth value of condensed water generated by the air conditioner; inputting the environmental state information into a pre-trained condensate water dynamic regulation strategy network, so that the condensate water dynamic regulation strategy network determines a target working state of the air conditioner according to the environmental state information, wherein the target working state is used for controlling the water accumulation depth of the condensed water generated by the air conditioner to be within a preset water accumulation depth threshold and for controlling the temperature change of the current environment of the air conditioner to be within a preset range; and controlling the air conditioner to work according to the target working state.
2. The control method according to claim 1, wherein the target working state of the air conditioner includes a compressor power of the air conditioner and/or a fan rotation speed of the air conditioner; and determining the target working state of the air conditioner includes: determining the compressor power of the air conditioner, and/or determining the fan rotation speed of the air conditioner.
3. The control method according to claim 1 or 2, wherein the condensate water dynamic regulation strategy network is obtained by training an initial proximal (near-end) policy optimization (PPO) network by: determining an objective function, a constraint condition, a state space, an action space and a reward function of the initial PPO network, wherein the objective function defines the target to be reached by training the initial PPO network, the constraint condition constrains the temperature change of the current environment of the air conditioner to be within a preset range, the state space defines a set of environment parameters, the action space is the set of executable control actions, and the reward function represents the immediate benefit obtained under a specified action; and training the initial PPO network, updating it through an advantage function and a clipping loss function until a convergence condition is met, and taking the PPO network at convergence as the condensate water dynamic regulation strategy network.
4. The control method according to claim 3, wherein determining the objective function and the constraint condition of the initial proximal (near-end) policy optimization (PPO) network comprises determining them by the following formulas. The objective function is: [formula not reproduced in the source text]; wherein its symbols denote, respectively, the expected value, the discount factor and the reward function. The constraint condition is: [formula not reproduced in the source text]; wherein the constraint condition represents a constraint mechanism on the action space; a denotes an action; three further symbols denote, respectively, the compressor power, the minimum compressor power corresponding to the set temperature value and the maximum compressor power corresponding to the set temperature value; fan denotes the fan rotation speed; f1 is the minimum fan rotation speed corresponding to the set temperature value; and f2 is the maximum fan rotation speed corresponding to the set temperature value.
5. The control method according to claim 3, wherein determining the state space, the action space and the reward function of the initial proximal (near-end) policy optimization (PPO) network comprises determining them by the following formulas. The state space is: [formula not reproduced in the source text]; wherein its symbols denote, respectively: the current environmental state; the indoor relative humidity, divided into low, medium and high levels; the indoor temperature, divided into low, medium and high levels; the historical water accumulation depth, divided into low, medium and high levels; and the door and window state. The state transition process corresponding to the state space is expressed as: [formula not reproduced in the source text]; wherein its symbols denote, respectively: the environmental state at the next moment; the system dynamics function; the action performed; and the environmental disturbance, which comprises door and window state changes. The parameters of the action space are operated as follows: [two formulas not reproduced in the source text]; wherein, for the compressor, the symbols denote the target power, the current power and the power adjustment amount, the adjustment corresponding to a first stepping mode; and, for the fan, the symbols denote the target rotation speed, the current rotation speed and the rotation speed adjustment amount, the adjustment corresponding to a second stepping mode. The reward function is: [three formulas not reproduced in the source text]; wherein the symbols denote, respectively: the reward function; the water accumulation depth reward and its weight coefficient; the predicted water accumulation depth; Q, a set value related to the preset water accumulation depth threshold and smaller than that threshold; the comfort reward and its weight coefficient; and T, the set temperature value.
6. The control method according to claim 3, wherein training the initial proximal (near-end) policy optimization (PPO) network, updating it through the advantage function and the clipping loss function until the convergence condition is met, and taking the PPO network at convergence as the condensate water dynamic regulation strategy network comprises: initializing a policy network, a value network and an experience buffer pool in the initial PPO network; and repeating the following steps until the convergence condition is met: acquiring a current environmental state from a pre-constructed state space and inputting it into the policy network; outputting an action through the policy network, subject to the constraint condition being satisfied; executing the action and determining a reward function value based on the environmental state acquired at the next moment, wherein the acquisition times of the next-moment environmental state and the current environmental state differ by a preset time interval; storing the current environmental state, the action, the reward function value and the next-moment environmental state in the experience buffer pool as an experience sample; extracting a set of experience samples from the experience buffer pool according to a preset sampling mode; determining the advantage function and the clipping loss function based on the set of experience samples, and determining a mean square error loss function of the value network based on the advantage function and the clipping loss function; updating the policy network and the value network based on the mean square error loss function; and, after the network update of the current training round is completed, judging whether the convergence condition is met and, if so, ending the training and taking the PPO network at convergence as the condensate water dynamic regulation strategy network.
7. The control method according to claim 6, wherein the convergence condition includes: once the number of consecutive training rounds reaches a preset number of rounds, the following requirements are met simultaneously: the standard deviation of the cumulative reward is less than or equal to a first preset threshold, wherein the standard deviation of the cumulative reward is determined from the reward function values acquired in each of the preset number of consecutive training rounds; the water accumulation depth compliance rate is greater than or equal to a second preset threshold, wherein the water accumulation depth compliance rate is determined from the water accumulation depth values obtained in each of the preset number of consecutive training rounds; and the temperature fluctuation exceedance rate is less than or equal to a third preset threshold, wherein the temperature fluctuation exceedance rate is determined from the temperature values acquired in each of the preset number of consecutive training rounds.
8. A control device of an air conditioner, comprising: an acquisition unit configured to acquire environmental state information, wherein the environmental state information comprises a temperature value, a humidity value and a door and window state value of the current environment of an air conditioner, and a water accumulation depth value of condensed water generated by the air conditioner; and a control unit configured to input the environmental state information into a pre-trained condensate water dynamic regulation strategy network, so that the condensate water dynamic regulation strategy network determines a target working state of the air conditioner according to the environmental state information, wherein the target working state is used for controlling the water accumulation depth of the condensed water generated by the air conditioner to be within a preset water accumulation depth threshold and for controlling the temperature change of the current environment of the air conditioner to be within a preset range; the control unit being further configured to control the air conditioner to work according to the target working state.
9. An air conditioner control device, comprising a processor and a memory, wherein the processor and the memory are connected with each other, and the memory stores machine-readable instructions executable by the processor, and the processor executes the machine-readable instructions to implement the air conditioner control method according to any one of claims 1 to 7.
10. An air conditioner, comprising the air conditioner control device according to claim 9.
11. A storage medium comprising a stored program, wherein the program, when run, controls a device in which the storage medium is located to perform the control method of the air conditioner of any one of claims 1 to 7.
12. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, realizes the steps of the control method of an air conditioner according to any one of claims 1 to 7.
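
The update step named in claims 3 and 6 (an advantage function plus a clipping loss function, with a mean square error loss for the value network) can be illustrated with the standard clipped-surrogate form of proximal policy optimization. The sketch below is a minimal illustration and not the patent's formulas, which are not reproduced in this text. It assumes the advantage is estimated as discounted return minus value estimate, that the value loss regresses on discounted returns, and that the clip range is 0.2. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns: Sequence[float], values: Sequence[float]) -> List[float]:
    """Assumed estimator: advantage = discounted return - value estimate."""
    return [g - v for g, v in zip(returns, values)]


def clipped_surrogate_loss(
    new_logp: Sequence[float],
    old_logp: Sequence[float],
    adv: Sequence[float],
    clip_eps: float = 0.2,  # assumed clip range, not taken from the patent
) -> float:
    """Standard PPO clipped objective, negated so that lower is better."""
    total = 0.0
    for nl, ol, a in zip(new_logp, old_logp, adv):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * a, clipped * a)
    return -total / len(adv)


def value_mse_loss(values: Sequence[float], returns: Sequence[float]) -> float:
    """Mean square error between value estimates and discounted returns."""
    return sum((v - g) ** 2 for v, g in zip(values, returns)) / len(values)
```

In a training loop, these quantities would be computed on each batch drawn from the experience buffer pool, and the two losses would then drive gradient updates of the policy network and the value network.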

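The three-part convergence condition of claim 7 can be sketched as a single check over the most recent training rounds. This is a hedged illustration only. It assumes one cumulative reward, one water accumulation depth value and one temperature value per round, uses the population standard deviation, and treats a round as exceeding the temperature limit when the temperature deviates from the set temperature by more than a band. The function name, parameter names and thresholds are all hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def has_converged(
    round_rewards: Sequence[float],
    round_depths_cm: Sequence[float],
    round_temps_c: Sequence[float],
    *,
    window: int,                  # preset number of consecutive rounds
    depth_limit_cm: float,        # preset water accumulation depth threshold
    set_temp_c: float,            # set temperature value
    temp_band_c: float,           # allowed deviation from the set temperature
    max_reward_std: float,        # first preset threshold
    min_depth_ok_rate: float,     # second preset threshold
    max_temp_exceed_rate: float,  # third preset threshold
) -> bool:
    """Return True only if all three criteria hold over the last `window` rounds."""
    shortest = min(len(round_rewards), len(round_depths_cm), len(round_temps_c))
    if window < 1 or shortest < window:
        return False  # not enough consecutive rounds yet

    rewards = round_rewards[-window:]
    depths = round_depths_cm[-window:]
    temps = round_temps_c[-window:]

    reward_std = pstdev(rewards)
    depth_ok_rate = sum(d <= depth_limit_cm for d in depths) / window
    temp_exceed_rate = sum(abs(t - set_temp_c) > temp_band_c for t in temps) / window

    return (
        reward_std <= max_reward_std
        and depth_ok_rate >= min_depth_ok_rate
        and temp_exceed_rate <= max_temp_exceed_rate
    )
```

Requiring all three criteria at once keeps training from stopping on a policy whose reward has stabilized but which still lets the water depth or the temperature drift out of range.
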
Description

Air conditioner control method, device, air conditioner, storage medium and program product

Technical Field

The invention belongs to the technical field of intelligent control of air conditioners, and particularly relates to a control method and device of an air conditioner, an air conditioner, a storage medium and a computer program product.

Background

When an air conditioner operates in a high-humidity environment, condensate production rises sharply and water readily accumulates in the water pan. Standing water in the air conditioner water pan can serve as a breeding site for Aedes albopictus (a vector of chikungunya fever), whose eggs can hatch within 24 hours in standing water 0.5 cm deep. In existing air conditioning systems, condensate discharge control typically uses a fixed-threshold strategy: for example, when indoor humidity exceeds 80%, compressor power and fan speed are forcibly reduced to limit condensate generation. Although this approach suppresses condensate accumulation to some extent, it has the following technical problem: a fixed threshold adapts poorly to dynamically changing environmental conditions, so the accumulated water depth of the condensed water can exceed the critical value for mosquito egg breeding, which increases the risk of mosquito vector breeding. How to improve intelligent control of condensed water has therefore become a problem to be solved.

Disclosure of Invention

The invention provides a control method and device of an air conditioner, an air conditioner, a storage medium and a computer program product, which avoid the problems of fixed-threshold condensate control in related schemes, can automatically keep the condensed water below a safety threshold, keep the temperature change within a preset range, and improve comfort.
The invention provides a control method of an air conditioner, which comprises: obtaining environmental state information; inputting the environmental state information into a pre-trained condensate water dynamic regulation strategy network, so that the condensate water dynamic regulation strategy network determines a target working state of the air conditioner according to the environmental state information, wherein the target working state is used for controlling the water accumulation depth of the condensed water generated by the air conditioner to be within a preset water accumulation depth threshold and for controlling the temperature change of the current environment of the air conditioner to be within a preset range; and controlling the air conditioner to work according to the target working state. In some embodiments, the target working state of the air conditioner comprises the compressor power of the air conditioner and/or the fan rotation speed of the air conditioner, and determining the target working state of the air conditioner includes determining the compressor power of the air conditioner and/or determining the fan rotation speed of the air conditioner.
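As a rough illustration of the control flow summarized above, the following Python sketch reads an environmental state, asks a policy for a target working state, and limits the result to permitted ranges for compressor power and fan rotation speed before it would be applied. It is a sketch under stated assumptions, not the patented implementation: the `EnvState` and `TargetState` types, the `toy_policy` stand-in for the trained network, the units, and every numeric bound are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class EnvState:
    temperature_c: float    # indoor temperature
    humidity_pct: float     # indoor relative humidity
    window_open: bool       # door and window state
    water_depth_cm: float   # condensate depth in the water pan


@dataclass(frozen=True)
class TargetState:
    compressor_power_w: float
    fan_speed_rpm: float


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def control_step(
    state: EnvState,
    policy: Callable[[EnvState], TargetState],
    power_bounds: Tuple[float, float],
    fan_bounds: Tuple[float, float],
) -> TargetState:
    """Query the policy, then keep its action inside the allowed ranges."""
    raw = policy(state)
    return TargetState(
        compressor_power_w=clamp(raw.compressor_power_w, *power_bounds),
        fan_speed_rpm=clamp(raw.fan_speed_rpm, *fan_bounds),
    )


def toy_policy(state: EnvState) -> TargetState:
    """Hypothetical stand-in for the trained network: back off as water rises."""
    scale = 1.0 - min(state.water_depth_cm / 0.5, 1.0)
    return TargetState(2000.0 * scale, 1500.0 * scale)


if __name__ == "__main__":
    reading = EnvState(26.0, 85.0, False, 0.25)
    print(control_step(reading, toy_policy, (400.0, 1800.0), (600.0, 1200.0)))
```

Clamping after the policy output is only one way to respect such limits; a trained network could equally be designed so that its outputs fall inside the permitted ranges by construction.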
In some embodiments, the condensate water dynamic regulation strategy network is obtained by training an initial proximal (near-end) policy optimization (PPO) network by: determining an objective function, a constraint condition, a state space, an action space and a reward function of the initial PPO network, wherein the objective function defines the target to be reached by training the initial PPO network, the constraint condition constrains the temperature change of the current environment of the air conditioner to be within a preset range, the state space defines a set of environment parameters, the action space is the set of executable control actions, and the reward function represents the immediate benefit obtained under a specified action; and training the initial PPO network, updating it through an advantage function and a clipping loss function until a convergence condition is met, and taking the PPO network at convergence as the condensate water dynamic regulation strategy network. In some embodiments, determining the objective function and the constraint condition of the initial PPO network includes determining them by the following formulas. The objective function is: [formula not reproduced in the source text]; wherein its symbols denote, respectively, the expected value, the discount factor and the reward function. The constraint condition is: [formula not reproduced in the source text]; wherein the constraint condition represents a constraint mechanism on the action space; a denotes an action, and a further symbol is used for indicating t