US Pat. No. 9,836,873

ANALYSIS AND MANIPULATION OF PANORAMIC SURROUND VIEWS

FYUSION, INC., San Franc...

1. A method comprising:
obtaining a first surround view, wherein the first surround view includes a panoramic multi-view interactive digital media representation of an object, wherein the first surround view is an object panorama;
obtaining a second surround view, wherein the second surround view includes a panoramic view of a distant scene; and
generating a third surround view including the first surround view placed in a foreground position relative to the second surround view, wherein the first and second surround views are obtained using different capture motions such that a more complex surround view corresponding to a more complex capture motion can be generated using separate smaller surround views corresponding to separate smaller capture motions.
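For illustration only, the sketch below shows one way the foreground/background placement could be realized: corresponding views of an object panorama are composited over views of a distant-scene panorama using a per-view alpha mask. The function name, data layout, and masking approach are assumptions, not the claimed implementation.

```python
import numpy as np

def composite_surround_views(object_views, object_masks, background_views):
    """Place an object panorama (foreground surround view) over a distant-scene
    panorama (background surround view), one corresponding view at a time.
    Each view is assumed to be an HxWx3 uint8 array and each mask an HxW
    float array in [0, 1]; this data layout is hypothetical."""
    combined = []
    for fg, mask, bg in zip(object_views, object_masks, background_views):
        alpha = mask[..., None]                                  # broadcast over color channels
        blended = alpha * fg.astype(np.float32) + (1.0 - alpha) * bg.astype(np.float32)
        combined.append(blended.astype(np.uint8))
    return combined
```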

US Pat. No. 10,169,911

ANALYSIS AND MANIPULATION OF PANORAMIC SURROUND VIEWS

FYUSION, INC., San Franc...

1. A method comprising:
obtaining a first surround view, wherein the first surround view includes a panoramic multi-view interactive digital media representation of an object, wherein the first surround view is an object panorama;
obtaining a second surround view, wherein the second surround view includes a panoramic view of a distant scene; and
generating a third surround view including the first surround view placed in a foreground position relative to the second surround view, wherein the first and second surround views are obtained using different capture motions, wherein the third surround view is generated using IMU data, keypoint tracks, and view interpolation in order to reduce the amount of data that must be transferred during upload or download of the third surround view.

US Pat. No. 9,940,541

ARTIFICIALLY RENDERING IMAGES USING INTERPOLATION OF TRACKED CONTROL POINTS

FYUSION, INC., San Franc...

1. A method comprising:
tracking a set of control points between a first frame and a second frame, wherein the first frame includes a first image captured from a first location and the second frame includes a second image captured from a second location, the first and second locations corresponding to real world location positions; and
generating an artificially rendered image as a third frame corresponding to a third location, the third location being a real world location position on a trajectory between the first location and the second location, wherein generating the artificially rendered image includes:
interpolating a transformation using at least one of homography, affine, similarity, translation, rotation, and scale, including interpolating individual control points for the third location using IMU data and the set of control points, the IMU data corresponding to the first and second locations, and interpolating pixel locations using the individual control points, wherein the individual control points are used to transform image data, wherein interpolating the transformation includes using depth information to reduce occurrence of artifacts resulting from mismatched pixels;
gathering weighted image information by transferring first image information from the first frame to the third frame based on the interpolated transformation and transferring second image information from the second frame to the third frame, wherein the image information is weighted by 1-x for the first image information and x for the second image information; and
combining the first image information and the second image information to form the artificially rendered image.
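The following Python sketch illustrates the general pattern of this claim under simplifying assumptions: control points are interpolated linearly to the intermediate position, the transformation is an affine fit estimated with OpenCV (one of the families the claim lists), and the transferred image information is weighted by 1-x and x before being combined. The IMU and depth terms are omitted, and the function is illustrative rather than the claimed implementation.

```python
import cv2
import numpy as np

def render_intermediate_frame(frame1, frame2, pts1, pts2, x):
    """Render an artificial frame at fractional position x (0..1) between two
    captured frames.  pts1 and pts2 are Nx2 float32 arrays of tracked control
    points in the first and second frames."""
    # Interpolate the individual control points for the intermediate location.
    pts_mid = (1.0 - x) * pts1 + x * pts2

    # Fit transformations that carry each source frame onto the intermediate view.
    t1, _ = cv2.estimateAffinePartial2D(pts1, pts_mid)
    t2, _ = cv2.estimateAffinePartial2D(pts2, pts_mid)

    h, w = frame1.shape[:2]
    warped1 = cv2.warpAffine(frame1, t1, (w, h))
    warped2 = cv2.warpAffine(frame2, t2, (w, h))

    # Weight the transferred image information by (1 - x) and x, then combine.
    blended = (1.0 - x) * warped1.astype(np.float32) + x * warped2.astype(np.float32)
    return blended.astype(np.uint8)
```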

US Pat. No. 10,026,219

ANALYSIS AND MANIPULATION OF PANORAMIC SURROUND VIEWS

FYUSION, INC., San Franc...

1. A method comprising:
generating a combined surround view including a first surround view placed in a foreground position relative to a second surround view, wherein the first and second surround views are obtained using different capture motions, and wherein non-linear surround views are separated into discrete components and complex motions are broken down into locally convex and linear portions such that a more complex surround view corresponding to a more complex capture motion can be generated using separate smaller surround views corresponding to separate smaller capture motions.

US Pat. No. 9,996,945

LIVE AUGMENTED REALITY GUIDES

FYUSION, INC., San Franc...

1. A method comprising:
receiving live images from a camera on a mobile device comprising 2-D pixel data wherein the live images are output to a display on the mobile device and show what is currently being captured by the camera;
capturing a plurality of images from among the live images to generate a multi-view interactive digital media representation of an object appearing in the live images wherein a position and/or orientation of the camera varies during the capturing of the plurality of images;
while the live images are currently being output to the display on the mobile device, receiving, via a touchscreen over the display, a selection of a location on the touchscreen wherein the location is over the object;
based upon the location on the touchscreen, determining a first position of a tracking point in the 2-D pixel data from a first image among the live images;
receiving sensor data indicating the position and/or orientation of the camera;
as additional live images after the first image are captured by the camera and output to the display, determining a current position of the tracking point in each of the additional live images on an image by image basis;
based upon the current position of the tracking point and the sensor data, rendering a virtual object, separate from and in addition to the object, into each of the additional live images output to the display to generate a plurality of synthetic images wherein in each of the synthetic images the virtual object is positioned relative to the current position of the tracking point determined for each of the additional live images, wherein the virtual object provides information related to i) a progress of the camera along a path, ii) a current position and/or current orientation of the camera relative to the path during the capture of the plurality of images used to generate the multi-view interactive digital media representation of the object; and
outputting the plurality of synthetic images to the display wherein each of the synthetic images shows the object as currently being captured by the camera and the virtual object.

US Pat. No. 10,070,154

CLIENT-SERVER COMMUNICATION FOR LIVE FILTERING IN A CAMERA VIEW

Fyusion, Inc., San Franc...

1. A method comprising:
automatically transmitting, using an electronic client device, a first video frame in a raw video stream from the electronic client device to a server via a communication network in response to determining the first video frame meets a designated criterion, the raw video stream being captured live by the electronic client device;
receiving, by the electronic client device, from the server a filter processing message associated with the first video frame, the filter processing message including filter data for applying a filter to the first video frame;
creating a filtered video stream in real time via a processor at the electronic client device by applying the filter to a second video frame, wherein the first video frame precedes the second video frame in the raw video stream, wherein the filter is applied to the second video frame by propagating information from the first video frame to the second video frame based on the filter data, wherein the second video frame is neither transmitted to nor received from the server; and
presenting the filtered video stream live at the electronic client device;
wherein applying the filter to the second video frame comprises:
identifying a first one or more image features in the first video frame via the processor at the electronic client device;
propagating the first one or more image features to the second video frame;
identifying a second one or more image features in the second video frame via the processor at the electronic client device; and
identifying a correspondence between the first one or more image features and the second one or more image features.
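A minimal sketch of the propagation step described above, assuming the server's filter data reduces to an image-sized BGRA overlay anchored at the transmitted keyframe: features are identified in both frames, correspondences are found, and the overlay is carried onto the later frame without sending that frame to the server. The ORB-plus-affine pipeline and the overlay format are assumptions for illustration, not the claimed method.

```python
import cv2
import numpy as np

def propagate_filter(keyframe, current_frame, filter_overlay):
    """Carry filter data anchored to a transmitted keyframe onto a later live
    frame.  'filter_overlay' is assumed to be an HxWx4 BGRA image the server
    returned for the keyframe; neither the frame layout nor the overlay format
    comes from the patent."""
    gray_key = cv2.cvtColor(keyframe, cv2.COLOR_BGR2GRAY)
    gray_cur = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)

    # Identify image features in the keyframe and in the current frame.
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(gray_key, None)
    kp2, des2 = orb.detectAndCompute(gray_cur, None)

    # Identify correspondences between the two sets of features
    # (assumes enough features are found in both frames).
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Estimate how the keyframe content moved and carry the overlay along.
    transform, _ = cv2.estimateAffinePartial2D(src, dst)
    h, w = current_frame.shape[:2]
    warped = cv2.warpAffine(filter_overlay, transform, (w, h))

    # Blend the propagated overlay onto the live frame using its alpha channel.
    alpha = warped[..., 3:4].astype(np.float32) / 255.0
    out = (1.0 - alpha) * current_frame.astype(np.float32) + alpha * warped[..., :3].astype(np.float32)
    return out.astype(np.uint8)
```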

US Pat. No. 10,210,662

LIVE AUGMENTED REALITY USING TRACKING

Fyusion, Inc., San Franc...

1. A method comprising:
receiving a request to capture a plurality of images used to generate a multi-view interactive digital media representation of a real object appearing in the plurality of images;
receiving first live images, including the real object, captured from a camera on a mobile device wherein the live images are output to a display on the mobile device and show what is currently being captured by the camera and wherein the live images comprise first 2-D pixel data;
receiving first sensor data indicating a first orientation of the camera associated with a first image among the first live images;
generating a first synthetic image comprising 1) a location selector rendered into the first 2-D pixel data associated with the first image wherein the location selector is a movable first virtual object in the first synthetic image that when selected causes a pixel location in the first image to be selected and 2) the real object;
receiving, via a touch screen over the display and via the location selector, a selection of the pixel location in the first 2-D pixel data from the first image;
determining a first pixel location in the first 2-D pixel data from the first image of a first tracking point wherein the first tracking point is within the first 2-D pixel data associated with the real object and is proximate to the pixel location selected via the location selector;
generating a second synthetic image comprising 1) a second virtual object rendered into the first 2-D pixel data from the first image wherein the second virtual object is positioned in the first 2-D pixel data from the first live image relative to the first pixel location of the first tracking point;
outputting the second synthetic image to the display;
receiving second live images captured by the camera after the first image is captured including second 2-D pixel data wherein the second live images include the real object from a plurality of different views;
receiving second sensor data, associated with the second live images, indicating second orientations of the camera on the mobile device associated with the plurality of different views;
receiving second live image data including second 2-D pixel data from the camera;
based upon the first sensor data, the second sensor data, the first 2-D pixel data and the second 2-D pixel data, determining, as the view of the real object in the second live images changes, second pixel locations of the first tracking point in the second 2-D pixel data of the second live images on an image by image basis wherein the second pixel locations are determined using one of spatial intensity information or optical flows derived from the second 2-D pixel data;
generating third synthetic images including the second virtual object rendered into the second 2-D pixel data at third pixel locations positioned relative to the second pixel locations of the first tracking point;
outputting the third synthetic images to the display wherein each of the third synthetic images shows one of the different views of the real object as currently being captured by the camera and the second virtual object.
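For illustration, a minimal sketch of the tracking step using pyramidal Lucas-Kanade optical flow (one of the two options the claim names) to update the tracking point on an image-by-image basis, and drawing a simple marker relative to the tracked position. The marker and data layout are hypothetical, not the claimed rendering.

```python
import cv2
import numpy as np

def track_and_annotate(prev_gray, curr_gray, curr_frame, point):
    """Track a single user-selected point from the previous frame into the
    current frame with pyramidal Lucas-Kanade optical flow, then render a
    hypothetical virtual marker (a circle) at the tracked position."""
    prev_pts = np.array([[point]], dtype=np.float32)          # shape (1, 1, 2)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)

    if status[0][0] == 1:                                      # the point was found
        point = (float(next_pts[0][0][0]), float(next_pts[0][0][1]))

    # Render the virtual object relative to the current tracking-point position.
    synthetic = curr_frame.copy()
    cv2.circle(synthetic, (int(point[0]), int(point[1])), 12, (0, 255, 0), 2)
    return synthetic, point
```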

US Pat. No. 10,068,316

TILTS AS A MEASURE OF USER ENGAGEMENT FOR MULTIVIEW DIGITAL MEDIA REPRESENTATIONS

Fyusion, Inc., San Franc...

1. A method comprising:
outputting to a display a first 2-D image rendered from a 3-D model of a 3-D object in a 3-D model space wherein points defining the 3-D model are at first 3-D locations in the 3-D model space when the first 2-D image is rendered;
after the first 2-D image is output to the display, receiving a sequence of navigational inputs from at least one input source wherein the sequence of navigational inputs begins with a first navigational input;
selecting a second navigational input from among the sequence of navigational inputs;
based upon at least the second navigational input, determining second 3-D locations of the points defining the 3-D model in the 3-D model space;
based upon the second 3-D locations, outputting to the display a second 2-D image rendered from the 3-D model in the 3-D model space;
based upon one of i) a first change between the first navigational input and the second navigational input, ii) a second change between the first 3-D locations and the second 3-D locations or iii) combinations thereof, determining whether to increment a count;
incrementing the count;
after the count is incremented, determining the count exceeds a threshold value;
in response to the count exceeding the threshold value, unlocking a second 3-D model of a second 3-D object;
determining first 3-D locations of points defining the second 3-D model of the second 3-D object in the 3-D model space; and
based upon the first 3-D locations of the points defining the second 3-D model, outputting to the display a third 2-D image rendered from the second 3-D model in the 3-D model space.
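A small sketch of the count-and-threshold gating this claim describes, assuming each navigational input reduces to a scalar (for example, a tilt angle). The minimum-change and threshold values are illustrative only and are not taken from the patent.

```python
class EngagementGate:
    """Count qualifying navigational changes and unlock new content at a
    threshold.  Treats each navigational input as a scalar (e.g. a tilt angle);
    the minimum-change and threshold values are illustrative only."""

    def __init__(self, min_change=2.0, threshold=30):
        self.min_change = min_change
        self.threshold = threshold
        self.count = 0
        self.unlocked = False

    def on_navigation(self, previous_input, current_input):
        # Increment the count only when the change between successive inputs
        # is large enough to count as a real view change.
        if abs(current_input - previous_input) >= self.min_change:
            self.count += 1
        # Unlock the second 3-D model once the count exceeds the threshold.
        if not self.unlocked and self.count > self.threshold:
            self.unlocked = True
        return self.unlocked
```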

US Pat. No. 10,200,677

INERTIAL MEASUREMENT UNIT PROGRESS ESTIMATION

Fyusion, Inc., San Franc...

1. A method comprising:
on a mobile device including a processor, a memory, a camera, an inertial measurement unit, a microphone, a GPS sensor and a touchscreen display, receiving a request to generate a multi-view interactive digital media representation of an object;
receiving a sequence of live images from the camera on the mobile device, wherein the live images include 2-D pixel data, wherein the camera moves along a path and wherein an orientation of the camera varies along the path such that the object in the sequence of the live images is captured from a plurality of camera views;
based upon sensor data from the inertial measurement unit, determining angular changes in the orientation of the camera along the path;
based upon the angular changes, determining an angular view of the object captured in the sequence of the live images; and
generating from the sequence of the live images the multi-view interactive digital media representation wherein the multi-view interactive digital media representation includes a plurality of images wherein each of the plurality of images includes the object from a different camera view such that when the plurality of images is output to the touchscreen display the object appears to undergo a 3-D rotation through the angular view wherein the 3-D rotation of the object is generated without a 3-D polygon model of the object; and
outputting a value of the angular view of the object captured in the multi-view interactive digital media representation.
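A minimal sketch of estimating the angular view by integrating gyroscope yaw-rate samples over time. It assumes rotation about a single axis and no accelerometer fusion, so it only approximates the kind of IMU-based progress estimate the claim describes.

```python
import numpy as np

def angular_view_from_gyro(yaw_rates, timestamps):
    """Estimate the angular view (degrees) swept along the capture path by
    integrating gyroscope yaw-rate samples (rad/s) over their timestamps (s).
    Assumes rotation about a single axis and no sensor fusion."""
    yaw_rates = np.asarray(yaw_rates, dtype=np.float64)
    timestamps = np.asarray(timestamps, dtype=np.float64)
    dt = np.diff(timestamps)
    # Sum the per-interval angular changes in camera orientation.
    total = np.sum(yaw_rates[1:] * dt)
    return abs(np.degrees(total))
```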

US Pat. No. 10,176,592

MULTI-DIRECTIONAL STRUCTURED IMAGE ARRAY CAPTURE ON A 2D GRAPH

Fyusion, Inc., San Franc...

1. A method for capturing an unstructured light field in a plurality of images, the method including:
identifying a plurality of keypoints on a first keyframe in a plurality of captured images;
computing the convex hull of all keypoints in the plurality of keypoints in the first keyframe to form a first convex hull;
merging the first convex hull with previous convex hulls corresponding to previous keyframes to form a convex hull union;
keeping track of each keypoint from the first keyframe to a second image;
adjusting the second image to compensate for camera rotation during capture of the second image;
computing the convex hull of all keypoints in the second image to form a second convex hull;
if the overlapping region between the second convex hull and the convex hull union is equal to, or less than, half of the size of the second convex hull, designating the second image as a new keyframe; and
if the second image is designated as a new keyframe, augmenting the convex hull union with the second convex hull.
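A compact sketch of the keyframe test, using shapely for the convex-hull and overlap computations. Keypoint tracking and rotation compensation are assumed to have been done upstream, and the function signature is hypothetical.

```python
from shapely.geometry import MultiPoint

def maybe_add_keyframe(keypoints, hull_union):
    """Decide whether a candidate image becomes a new keyframe.
    'keypoints' is a list of (x, y) keypoint locations tracked into the image
    (already rotation-compensated); 'hull_union' is the union of convex hulls
    of previous keyframes, or None if there are none yet."""
    hull = MultiPoint(keypoints).convex_hull

    if hull_union is None:
        return True, hull                        # first keyframe

    # New keyframe when at most half of the new hull overlaps the union
    # of the previous keyframe hulls.
    overlap = hull.intersection(hull_union).area
    if overlap <= 0.5 * hull.area:
        return True, hull_union.union(hull)      # augment the convex hull union
    return False, hull_union
```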

US Pat. No. 10,152,825

AUGMENTING MULTI-VIEW IMAGE DATA WITH SYNTHETIC OBJECTS USING IMU AND IMAGE DATA

Fyusion, Inc., San Franc...

1. A method comprising:
receiving a selection of an anchor location for a synthetic object to be placed within a multi-view image, the multi-view image captured with a camera having intrinsic parameters, wherein the anchor location is selected as a point from a reference view associated with a reference image, the reference view corresponding to one view of the multi-view image;
computing movements between a reference image and a target image using visual tracking information associated with the multi-view image, device orientation corresponding to the multi-view image, and an estimate of the camera's intrinsic parameters, wherein the camera's intrinsic parameters include at least an approximate estimate of a focal length;
generating a first synthetic image corresponding to a target view associated with the target image, wherein the first synthetic image is generated by placing the synthetic object at the anchor location using visual tracking information associated with the anchor location in the multi-view image, orienting the synthetic object using the inverse of the movements computed between the reference image and the target image, and projecting the synthetic object along a ray into the target view, wherein the anchor location includes three-dimensional coordinates corresponding to 2D coordinates specified in the reference image along with a depth perpendicular to the plane of the reference image, the depth being triangulated, wherein generating the first synthetic image includes scaling the triangulated depth based on scale changes in the multi-view image; and
overlaying the first synthetic image on the target image to generate an augmented image from the target view.
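A minimal pinhole-projection sketch of the placement step, assuming the relative rotation R and translation t from the reference view to the target view have already been computed from the visual tracking and device-orientation data, and that the anchor has been triangulated into reference-camera coordinates. The function and parameter names are illustrative only.

```python
import numpy as np

def project_anchor(anchor_ref, R, t, focal_length, principal_point):
    """Project a triangulated 3-D anchor point (reference-camera coordinates)
    into the target view.  R (3x3) and t (3,) are the relative rotation and
    translation from the reference camera to the target camera; focal_length
    and principal_point describe an approximate pinhole model."""
    # Transform the anchor into the target camera's coordinate frame.
    p = R @ np.asarray(anchor_ref, dtype=np.float64) + np.asarray(t, dtype=np.float64)

    # Project along the ray through the camera center onto the image plane.
    u = focal_length * p[0] / p[2] + principal_point[0]
    v = focal_length * p[1] / p[2] + principal_point[1]
    return np.array([u, v])
```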

US Pat. No. 10,147,211

ARTIFICIALLY RENDERING IMAGES USING VIEWPOINT INTERPOLATION AND EXTRAPOLATION

Fyusion, Inc., San Franc...

1. A method comprising:
moving a mobile device with a camera through space in a locally convex or locally concave motion;
obtaining, from the camera throughout the movement of the mobile device, a plurality of frames having location information obtained from Inertial Measurement Unit (IMU) information and depth information, wherein the plurality of frames include a first frame and a second frame;
moving a set of control points perpendicular to a trajectory between the first frame and the second frame, wherein the first frame includes a first image captured using the camera from a first location and the second frame includes a second image captured using the camera from a second location, wherein each control point is moved based on an associated depth of the control point, the depth of each of the control points being determined from the amount of frame-to-frame motion of the control points between the first frame and the second frame, wherein control points located at a further depth are moved less than control points located at a closer depth, wherein the set of control points is associated with a single layer among multiple layers;
generating an artificially rendered image corresponding to a third location outside of the trajectory by extrapolating individual control points using the set of control points for the third location and extrapolating pixel locations using the individual control points, wherein generating the artificially rendered image occurs on the fly based on content-weighted keypoint tracks between the first image and the second image, the IMU information, and the depth information; and
generating a surround view by fusing the plurality of frames along with the artificially rendered image, wherein the surround view is a three-dimensional multi-view interactive digital media representation (MIDMR) created from two-dimensional images in the plurality of frames and the artificially rendered image.
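A small sketch of the depth-dependent control-point motion this claim describes: points are shifted perpendicular to the capture trajectory by an amount that shrinks with depth, so content at a further depth moves less than content at a closer depth. The choice of perpendicular direction and the inverse-depth scaling are simplifying assumptions.

```python
import numpy as np

def extrapolate_control_points(control_points, depths, offset):
    """Shift one layer's control points perpendicular to the capture trajectory
    to synthesize a viewpoint outside it.  'depths' are per-point depth values
    inferred from frame-to-frame motion (larger means farther); 'offset' is the
    displacement, in pixels, applied to the nearest points."""
    pts = np.asarray(control_points, dtype=np.float64).copy()   # N x 2 (x, y)
    depths = np.asarray(depths, dtype=np.float64)

    # Scale the shift by inverse depth so control points at a further depth
    # move less than control points at a closer depth.
    shifts = offset * (depths.min() / depths)
    pts[:, 1] += shifts        # perpendicular direction assumed to be image y
    return pts
```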

US Pat. No. 10,242,474

ARTIFICIALLY RENDERING IMAGES USING VIEWPOINT INTERPOLATION AND EXTRAPOLATION

Fyusion, Inc., San Franc...

1. A method comprising:
moving a mobile device with a camera through space in a locally convex or locally concave motion;
obtaining, from the camera throughout the movement of the mobile device, a plurality of frames having location information, wherein the plurality of frames include a first frame and a second frame;
applying a transform to estimate a path outside the trajectory between the first frame and the second frame, wherein the first frame includes a first image captured by the camera from a first location and the second frame includes a second image captured by the camera from a second location;
generating an artificially rendered image corresponding to a third location, wherein the third location is positioned on the path, the artificially rendered image generated by:
interpolating a transformation from the first location to the third location and from the third location to the second location;
gathering image information from the first frame and the second frame by transferring first image information from the first frame to the third frame based on the interpolated transformation and second image information from the second frame to the third frame based on the interpolated transformation; and
combining the first image information and the second image information;
determining whether an occlusion is present due to movement of multiple layers; and
if an occlusion is detected, determining whether the layer closer to the camera is non-see-through or partially see-through, wherein if the layer closer to the camera is non-see-through, only image information from the layer closer to the camera is taken, wherein if the layer closer to the camera is partially see-through, then image information is taken from both layers,
wherein generating the artificially rendered image includes filling in missing information using viewpoint extrapolation by moving different layers of the multiple layers in a motion perpendicular to the trajectory.
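A minimal sketch of the occlusion rule in this claim, assuming the nearer layer comes with a per-pixel opacity map: non-see-through pixels take image information only from the nearer layer, while partially see-through pixels take information from both layers. The opacity-map representation is an assumption for illustration.

```python
import numpy as np

def resolve_occlusion(near_layer, far_layer, near_alpha):
    """Combine two image layers where the nearer layer may occlude the farther
    one.  'near_alpha' is an HxW opacity map in [0, 1]: 1 means non-see-through,
    anything below 1 means partially see-through."""
    alpha = near_alpha[..., None].astype(np.float32)
    opaque = alpha >= 1.0

    # Partially see-through pixels take image information from both layers.
    blended = alpha * near_layer.astype(np.float32) + (1.0 - alpha) * far_layer.astype(np.float32)

    # Non-see-through pixels take image information only from the nearer layer.
    out = np.where(opaque, near_layer.astype(np.float32), blended)
    return out.astype(np.uint8)
```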

US Pat. No. 10,237,477

LOOP CLOSURE

Fyusion, Inc., San Franc...

1. A method comprising:
on a mobile device including a processor, a memory, a camera, an inertial measurement unit, a microphone and a touchscreen display, receiving via an input interface on the mobile device a request to generate a multi-view interactive digital media representation of an object;
receiving live images from the camera on the mobile device as the mobile device moves along a path and wherein an orientation of the camera varies along the path such that the object in the live images is captured from a plurality of camera views;
based upon sensor data from the inertial measurement unit, determining angular changes in the orientation of the camera along the path;
based upon the angular changes, determining an angular view of the object captured in each of the live images;
based upon the determined angular view of the object in each of the live images, selecting a sequence of images from among the live images;
determining the angular view of the object is about three hundred sixty degrees in one of the live images;
selecting a final image in the sequence of images wherein the angular view of the object in the final image is about three hundred sixty degrees; and
generating from the sequence of the images the multi-view interactive digital media representation wherein the multi-view interactive digital media representation includes a plurality of images wherein each of the plurality of images includes the object from a different camera view such that when the plurality of images is output to the touchscreen display the object appears to undergo a 3-D rotation through about three hundred sixty degrees wherein the three hundred sixty degree 3-D rotation of the object is generated without a 3-D polygon model of the object.
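A small sketch of the loop-closure test, assuming the per-image angular view has already been computed from the inertial measurement unit as in the claim. The tolerance on "about three hundred sixty degrees" and the data layout are illustrative, not the claimed implementation.

```python
def select_loop_closure(frames_with_angles, full_turn=360.0, tolerance=5.0):
    """Select the image sequence that closes a loop around the object.
    'frames_with_angles' yields (image, cumulative_angle_degrees) pairs, the
    angle coming from integrated IMU orientation changes; the tolerance on
    'about three hundred sixty degrees' is illustrative."""
    sequence = []
    for image, angle in frames_with_angles:
        sequence.append(image)
        # Stop once the angular view of the object is about 360 degrees;
        # this image becomes the final image in the sequence.
        if angle >= full_turn - tolerance:
            break
    return sequence
```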