EP-4437503-B1 - CALIBRATING A CAMERA FOR MAPPING IMAGE PIXELS TO GRID POINTS IN A STORAGE SYSTEM


Inventors

CUI, Xiaokai
PEARMAN, Christopher
SCHUCHART, Jonathan

Dates

Publication Date: 2026-05-13
Application Date: 2022-11-25

Claims (15)

A computer-implemented method of calibrating a wide-angle or ultra wide-angle camera disposed above a grid of a storage system, wherein the grid is formed by a first set of parallel tracks extending in an X-direction and a second set of parallel tracks extending in a Y-direction transverse to the first set of parallel tracks in a substantially horizontal plane to define a plurality of grid spaces having a known aspect ratio, the first and second sets of parallel tracks are arranged to support one or more transport devices, selectively movable in at least one of the X-direction or Y-direction on the tracks, for handling a container stacked beneath the tracks within a footprint of a single grid space, the method comprising: obtaining an image of a grid section of the grid captured by the camera; obtaining initial values of a plurality of parameters corresponding to the camera, the plurality of parameters including a focal length of the camera, a translational vector representative of a position of the camera above the grid section, and a rotational vector representative of a tilt and rotation of the camera; processing the image, using a neural network trained to detect/predict the first and second sets of parallel tracks in images of grid sections captured by wide-angle or ultra wide-angle cameras, to generate a model of the first and second sets of parallel tracks as captured in the image of the grid section; mapping selected pixels in the model to corresponding points on the grid using a mapping based on the plurality of parameters, wherein the initial values are used as inputs to the mapping; determining an error function based on a discrepancy between grid coordinates of the points determined by the mapping and known grid coordinates; updating the initial values to updated values of the plurality of parameters based on the error function; and storing the updated values of the plurality of parameters for mapping pixels in images of the grid section captured by the camera to corresponding points on the grid of the storage system via the mapping.
The method of claim 1, comprising iteratively determining the error function and further updating the updated values of the plurality of parameters until the error function is reduced by less than a predetermined threshold.
The method of claim 1 or 2, wherein respective boundary values for the plurality of parameters are applied when updating respective values of the plurality of parameters.
The method of any preceding claim, comprising: processing the image, using an object detection model trained to detect instances of grid cell markers in images of grid sections, to detect a grid cell marker located in a grid cell of the grid section in the image; extracting grid cell coordinate data encoded in the grid cell marker; and calibrating grid coordinates, generated by the mapping of pixels in the image to points on the grid, based on the extracted cell coordinate data.
The method of any preceding claim, comprising refining the model of the grid section to represent only centrelines of the first and second sets of parallel tracks, the refining comprising filtering the model with horizontal and vertical line detection kernels, optionally the filtering comprises at least one of eroding and dilating pixel values of the model using the horizontal and vertical line detection kernels.
The method of claim 5, comprising: fitting the centrelines to respective quadratic trajectories to produce quadratic centrelines; and extracting a random subset of pixels from the quadratic centrelines to generate the model.
The method of any preceding claim, wherein the mapping comprises, for a given pixel in the image and corresponding point on the grid: determining second cartesian coordinates of the corresponding point based on image coordinates of the pixel in the image; converting the second cartesian coordinates into second polar coordinates; applying an inverse distortion model to the second polar coordinates to generate first polar coordinates; converting the first polar coordinates into first cartesian coordinates of the point; dealigning the point from the cartesian coordinate system in a second plane corresponding to the camera; projecting the point onto a first plane corresponding to the grid from the second plane to determine grid coordinates of the point relative to the grid.
The method of claim 7, wherein the inverse distortion model is based on a tangent model of distortion given by: $r = f \cdot \tan(r'/f)$; wherein r' is the distorted radial coordinate of the point, r is the undistorted radial coordinate of the point, and f is the focal length of the camera.
The method of claim 7 or 8, wherein determining the second cartesian coordinates comprises initialising the pixel in the image, including at least one of centering or normalising the image coordinates and/or wherein determining the second cartesian coordinates comprises inverting the ordinate of the image coordinates.
The method of any one of claims 7 to 9, wherein dealigning the point from the cartesian coordinate system comprises applying a rotational transformation to the point to align the point with an orientation of the grid in the image.
The method of any one of claims 7 to 10, wherein projecting the point onto the first plane corresponding to the grid comprises computing: $p = B^{-1} \cdot (f \cdot t - q \cdot z)$; wherein: $B = q \cdot R_{3,1:2} - f \cdot R_{1:2,1:2}$; wherein p comprises point coordinates in the first plane corresponding to the grid, f is the focal length of the camera, t is a planar translation vector, q comprises the first cartesian coordinates in the second plane, z is a distance between the camera and the grid, and R is a three-dimensional rotation matrix related to a rotation vector.
A calibration system for calibrating a wide-angle or ultra wide-angle camera disposed above a grid of a storage system, wherein the grid is formed by a first set of parallel tracks extending in an X-direction and a second set of parallel tracks extending in a Y-direction transverse to the first set of parallel tracks in a substantially horizontal plane to define a plurality of grid spaces having a known aspect ratio, the first and second sets of parallel tracks are arranged to support one or more transport devices, selectively movable in at least one of the X-direction or Y-direction on the tracks, for handling a container stacked beneath the tracks within a footprint of a single grid space, the calibration system comprising: at least one interface to obtain an image of a grid section of the grid captured by the camera, and obtain initial values of a plurality of parameters corresponding to the camera, the plurality of parameters including a focal length of the camera, a translational vector representative of a position of the camera above the grid section, and a rotational vector representative of a tilt and rotation of the camera; and at least one processor configured to: process the image, using a neural network trained to detect/predict the first and second sets of parallel tracks in images of grid sections captured by wide-angle or ultra wide-angle cameras, to generate a model of the first and second sets of parallel tracks as captured in the image of the grid section; map pixels in the model to corresponding points on the grid using a mapping based on the plurality of parameters, wherein the initial values are used as inputs to the mapping; determine an error function based on a discrepancy between grid coordinates of the points determined by the mapping and known grid coordinates; update the initial values to updated values of the plurality of parameters based on the error function; and output the updated values of the plurality of parameters to storage for mapping pixels in images of the grid section captured by the camera to corresponding points on the grid of the storage system via the mapping.
The calibration system of claim 12, wherein the at least one processor is configured to: process the image, using an object detection model trained to detect instances of grid cell markers in images of grid sections, to detect a grid cell marker located in a grid cell of the grid section in the image; extract grid cell coordinate data encoded in the grid cell marker; and calibrate grid coordinates, generated by the mapping of pixels in the image to points on the grid, based on the extracted cell coordinate data, optionally the grid cell marker comprises a signboard with the cell coordinate data marked on the signboard.
The calibration system of claim 12 or 13, wherein the mapping comprises, for a given pixel in the image and corresponding point on the grid: determining second cartesian coordinates of the corresponding point based on image coordinates of the pixel in the image; converting the second cartesian coordinates into second polar coordinates; applying an inverse distortion model to the second polar coordinates to generate first polar coordinates; converting the first polar coordinates into first cartesian coordinates of the point; dealigning the point from the cartesian coordinate system in a second plane corresponding to the camera; projecting the point onto a first plane corresponding to the grid from the second plane to determine grid coordinates of the point relative to the grid.
The calibration system of claim 14, wherein the inverse distortion model is based on a tangent model of distortion given by: $r = f \cdot \tan(r'/f)$; wherein r' is the distorted radial coordinate of the point, r is the undistorted radial coordinate of the point, and f is the focal length of the camera.
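
By way of illustration only, the pixel-to-grid mapping recited in the claims above may be sketched as follows. The sketch is not taken from the disclosure: the function names `undistort_pixel` and `project_to_grid` are hypothetical, the principal point is assumed to lie at the image centre, the focal length f is assumed to be expressed in pixels, and the dealigning rotation is omitted. The projection step reads the claimed expression as p = B^-1 (f t - q z) with B = q R[3, 1:2] - f R[1:2, 1:2], which is an interpretation of the formula as printed.

```python
import math


def undistort_pixel(u, v, width, height, f):
    """Map an image pixel (u, v) to undistorted camera-plane coordinates.

    Assumptions (not from the claims): the principal point is the image
    centre and f is given in pixel units.
    """
    # Centre the image coordinates and invert the ordinate.
    x_d = u - width / 2.0
    y_d = height / 2.0 - v
    # Cartesian -> polar (distorted radial coordinate r' and angle).
    r_d = math.hypot(x_d, y_d)
    theta = math.atan2(y_d, x_d)
    # The tangent model is only meaningful while r'/f < pi/2.
    if r_d / f >= math.pi / 2:
        raise ValueError("pixel lies outside the valid field of view")
    # Inverse distortion: r = f * tan(r' / f).
    r_u = f * math.tan(r_d / f)
    # Polar -> cartesian (undistorted point in the camera plane).
    return r_u * math.cos(theta), r_u * math.sin(theta)


def project_to_grid(q, f, t, z, R):
    """Project camera-plane point q onto the grid plane.

    q and t are 2-tuples, z is the camera-to-grid distance and R is a
    3x3 rotation matrix given as nested lists (0-based indexing here).
    """
    qx, qy = q
    # B = q * R[3, 1:2] - f * R[1:2, 1:2] (outer product minus scaled block).
    b11 = qx * R[2][0] - f * R[0][0]
    b12 = qx * R[2][1] - f * R[0][1]
    b21 = qy * R[2][0] - f * R[1][0]
    b22 = qy * R[2][1] - f * R[1][1]
    det = b11 * b22 - b12 * b21
    if det == 0:
        raise ValueError("B is singular for this point and pose")
    # Right-hand side c = f * t - q * z.
    c1 = f * t[0] - qx * z
    c2 = f * t[1] - qy * z
    # p = B^-1 * c, using the closed-form inverse of a 2x2 matrix.
    return (b22 * c1 - b12 * c2) / det, (b11 * c2 - b21 * c1) / det
```

With an untilted camera (R equal to the identity) and zero planar translation, the projection reduces to p = q z / f, so a ray leaving the camera at 45 degrees from the optical axis meets the grid at a horizontal offset equal to the camera height. In a calibration loop of the kind claimed, functions of this form would be evaluated repeatedly while f, t and the rotation are adjusted to reduce the error function.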

Description

Technical Field

The present disclosure generally relates to the field of a storage or fulfilment system in which stacks of bins or containers are arranged within a grid framework structure, and more specifically, to calibrating a wide-angle or ultra wide-angle camera disposed above the grid framework structure.

Background

Online retail businesses selling multiple product lines, such as online grocers and supermarkets, require systems that can store tens or hundreds of thousands of different product lines. The use of single-product stacks in such cases can be impractical since a vast floor area would be required to accommodate all of the stacks required. Furthermore, it can be desirable to store small quantities of some items, such as perishables or infrequently ordered goods, making single-product stacks an inefficient solution.
International patent application WO 98/049076A (Autostore) describes a system in which multi-product stacks of containers are arranged within a frame structure. PCT Publication No. WO2015/185628A (Ocado) describes a further known storage and fulfilment system in which stacks of containers are arranged within a grid framework structure. The containers are accessed by one or more load handling devices, otherwise known as "bots", operative on tracks located on the top of the grid framework structure. A system of this type is illustrated schematically in Figures 1 to 3 of the accompanying drawings.
As shown in Figures 1 and 2, stackable containers 10, also known as "bins", are stacked on top of one another to form stacks 12. The stacks 12 are arranged in a grid framework structure 14, e.g. in a warehousing or manufacturing environment. The grid framework structure 14 is made up of a plurality of storage columns or grid columns. Each grid in the grid framework structure has at least one grid column to store a stack of containers.
Figure 1 is a schematic perspective view of the grid framework structure 14, and Figure 2 is a schematic top-down view showing a stack 12 of bins 10 arranged within the framework structure 14. Each bin 10 typically holds a plurality of product items (not shown). The product items within a bin 10 may be identical or different product types depending on the application. The grid framework structure 14 comprises a plurality of upright members 16 that support horizontal members 18, 20. A first set of parallel horizontal grid members 18 is arranged perpendicularly to a second set of parallel horizontal members 20 in a grid pattern to form a horizontal grid structure 15 supported by the upright members 16. The members 16, 18, 20 are typically manufactured from metal. The bins 10 are stacked between the members 16, 18, 20 of the grid framework structure 14, so that the grid framework structure 14 guards against horizontal movement of the stacks 12 of bins 10 and guides the vertical movement of the bins 10. The top level of the grid framework structure 14 comprises a grid or grid structure 15, including rails 22 arranged in a grid pattern across the top of the stacks 12. Referring to Figure 3, the rails or tracks 22 guide a plurality of load handling devices 30. A first set 22a of parallel rails 22 guides movement of the robotic load handling devices 30 in a first direction (e.g. an X-direction) across the top of the grid framework structure 14. A second set 22b of parallel rails 22, arranged perpendicular to the first set 22a, guides movement of the load handling devices 30 in a second direction (e.g. a Y-direction), perpendicular to the first direction. In this way, the rails 22 allow the robotic load handling devices 30 to move laterally in two dimensions in the horizontal X-Y plane. A load handling device 30 can be moved into position above any of the stacks 12. 
A known form of load handling device 30 - shown in Figures 4, 5, 6A and 6B - is described in PCT Patent Publication No. WO2015/019055 (Ocado), where each load handling device 30 covers a single grid space 17 of the grid framework structure 14. This arrangement allows a higher density of load handlers and thus a higher throughput for a given sized storage system. The example load handling device 30 comprises a vehicle 32, which is arranged to travel on the rails 22 of the frame structure 14. A first set of wheels 34, consisting of a pair of wheels 34 at the front of the vehicle 32 and a pair of wheels 34 at the back of the vehicle 32, is arranged to engage with two adjacent rails of the first set 22a of rails 22. Similarly, a second set of wheels 36, consisting of a pair of wheels 36 at each side of the vehicle 32, is arranged to engage with two adjacent rails of the second set 22b of rails 22. Each set of wheels 34, 36 can be lifted and lowered so that either the first set of wheels 34 or the second set of wheels 36 is engaged with the respective set of rails 22a, 22b at any one time during movement of the load handling device 30. For example, when the first set of wheels 34 is engaged with the first set of