US Patent No. 10,923,213

LATENT SPACE HARMONIZATION FOR PREDICTIVE MODELING


Patent No. 10,923,213
Issue Date February 16, 2021
Title Latent Space Harmonization For Predictive Modeling
Inventorship Nicolo Fusi, Boston, MA (US)
Jennifer Listgarten, Cambridge, MA (US)
Gregory Byer Darnell, Princeton, NJ (US)
Assignee Microsoft Technology Licensing, LLC, Redmond, WA (US)

Claim of US Patent No. 10,923,213

1. A computing system to train a predictive model that models a physical principle or phenomenon using two or more training data sets including respective incomparable units of measurement, the system comprising:memory configured to maintain different measurement groups each including training data sets of input training data and output training data;
a processor system configured to:
receive the training data sets of the input training data and the output training data from two or more of the measurement groups, the output data of the two or more measurement groups including supervised target variables of the physical principle or phenomenon;
generate respective monotonic mapping functions for each of the two or more measurement groups to map output training data for each of the two or more measurement groups to a shared latent space by inferring monotonic relationships between the output training data for each of the two or more measurement groups and the shared latent space based on the input training data and the supervised target variables;
generate, using the monotonic mapping function for each of the two or more measurement groups, a combined training data set that maps the input training data from the two or more measurement groups to corresponding output training data in the shared latent space, such that the combined training data set includes a larger number of data samples than any of the training data sets from the two or more measurement groups, by transforming output training data from the two or more of the measurement groups to the output training data in the shared latent space; and
train a model that predicts outputs for new input data using machine learning and the combined training data set.