US Patent No. 10,596,698

MACHINE LEARNING DEVICE, ROBOT CONTROL SYSTEM, AND MACHINE LEARNING METHOD


Patent No.: 10,596,698
Issue Date: March 24, 2020
Title: Machine Learning Device, Robot Control System, And Machine Learning Method
Inventorship: Yuusuke Oota, Yamanashi (JP)
    Fumikazu Warashina, Yamanashi (JP)
    Hiromitsu Takahashi, Yamanashi (JP)
Assignee: Fanuc Corporation, Yamanashi (JP)

Claim of US Patent No. 10,596,698

1. A machine learning device for use with, and configured to perform reinforcement learning with respect to a robot control system, the robot control system comprising:
an illumination means that irradiates a surface to be inspected of an object to be inspected with illumination light;
an imaging means that images the surface to be inspected;
a robot that includes a robot hand;
a control unit that, while moving the robot hand gripping the object to be inspected or the imaging means, along a movement route including a plurality of imaging points set on the surface to be inspected so that the surface to be inspected is entirely covered by a plurality of images imaged by the imaging means, causes the imaging means to image the imaging points set on the surface to be inspected; and
a flaw inspection unit that detects a flaw on the surface to be inspected on the basis of the image obtained by imaging the surface to be inspected by the imaging means,
the machine learning device comprising:
an action information output unit that outputs action information including adjustment information of the imaging region including the imaging points, to the control unit;
a state information acquisition unit that acquires state information from the control unit and the flaw inspection unit resulting from a number N of images, obtained by imaging the surface to be inspected by the imaging means by moving the robot hand gripping the object to be inspected or the imaging means by the control unit, based on the action information, the state information including the number N and flaw detection information, the flaw detection information including a flaw detection position of the surface to be inspected, detected by the flaw inspection unit, with respect to each of a plurality of objects to be inspected prepared in advance;
a reward output unit that outputs a reward value in the reinforcement learning based on the number N and the flaw detection information including the flaw detection positions included in the state information; and
a value function updating unit that updates an action value function based on the reward value, the state information, and the action information.
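
For illustration only, the last two elements of the claim (a reward derived from the number N of images and the flaw detection information, and an update of an action value function from the reward, state, and action) can be sketched as a tabular Q-learning step. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not fix a reward rule or an update formula, so the reward thresholds, the baseline comparison, the standard Q-learning update, and every name below (`reward`, `QTable`, `alpha`, `gamma`, the action labels) are hypothetical.

```python
from collections import defaultdict


def reward(n_images, flaws_found, n_images_baseline, flaws_baseline):
    """Hypothetical reward rule (an assumption, not taken from the claim).

    Penalize an adjustment that loses flaws relative to a baseline run,
    reward one that finds the same flaws with fewer images, else stay neutral.
    """
    if flaws_found < flaws_baseline:
        return -1.0
    if n_images < n_images_baseline:
        return 1.0
    return 0.0


class QTable:
    """Tabular action value function Q(state, action) with a Q-learning update."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.alpha = alpha          # learning rate (assumed value)
        self.gamma = gamma          # discount factor (assumed value)
        self.q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

    def update(self, state, action, r, next_state, next_actions):
        # Best estimated value reachable from the next state (0.0 if no actions).
        best_next = max(
            (self.q[(next_state, a)] for a in next_actions), default=0.0
        )
        key = (state, action)
        # Standard Q-learning step: move Q toward r + gamma * max_a' Q(s', a').
        self.q[key] += self.alpha * (r + self.gamma * best_next - self.q[key])
        return self.q[key]
```

In this sketch a state could be a tuple such as `(N, number_of_flaws_detected)` and an action a label for an imaging-region adjustment; both encodings are assumptions chosen to keep the table small.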