1. A computer program product for predicting ad spend for a specific media program aired or streamed on a specific network at a specific date and time using a database of media program data that includes (i) known ad spend for a subset of media programs, and (ii) viewership data for each of the media programs, including total viewership and viewership ratings, wherein each of the media programs is identified by its respective network, and date and time of airing or streaming, and wherein ad spend is an amount of money spent on advertising for a product or service, the computer program product comprising a computer readable medium tangibly embodying non-transitory computer-executable program instructions thereon that, when executed, cause one or more computing devices in a machine learning platform to:
(a) perform data analysis on the media program data to identify one or more variables, or combinations of variables, that correlate with ad spend, the ad spend being the amount of money spent on advertising for a product or service;
(b) perform feature engineering on the identified one or more variables, or the combinations of variables, to identify a subset of one or more variables, or combinations of variables, that provide the greatest explanatory value;
(c) train a random forest model to predict ad spend using the identified subset of one or more variables, or the combinations of variables, the random forest model including a random forest having a plurality of individual decision trees; and
(d) predict ad spend for a specific media program that is aired or streamed on a specific network at a specific date and time, and which has an unknown ad spend, using the trained random forest model, wherein the predicted ad spend is an average of ad spend predicted from the individual decision trees of the random forest,
wherein the total viewership is captured using unique IP addresses of devices that viewed respective media programs.
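
For illustration only, steps (a) through (d) and the unique-IP viewership count might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: all function names, the toy data, the use of absolute Pearson correlation as the measure of explanatory value, and the tree depth and leaf-size settings are hypothetical. The forest here uses bootstrap sampling only and omits the per-split feature subsampling found in most random forest implementations.

```python
import random
from statistics import mean

def unique_viewers(ip_log):
    """Total viewership as the number of distinct device IP addresses."""
    return len(set(ip_log))

def pearson(xs, ys):
    """Pearson correlation; 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def top_features(rows, target, k):
    """Steps (a)-(b): rank columns by |correlation| with ad spend, keep k."""
    cols = range(len(rows[0]))
    score = {j: abs(pearson([r[j] for r in rows], target)) for j in cols}
    return sorted(cols, key=lambda j: -score[j])[:k]

def sse(ys):
    """Sum of squared errors around the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(rows, ys, depth=3, min_leaf=2):
    """Regression tree; a leaf is a number, a split is (col, threshold, lo, hi)."""
    if depth == 0 or len(ys) < 2 * min_leaf or len(set(ys)) == 1:
        return mean(ys)
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:
            lo = [i for i, r in enumerate(rows) if r[j] <= t]
            hi = [i for i, r in enumerate(rows) if r[j] > t]
            if len(lo) < min_leaf or len(hi) < min_leaf:
                continue
            cost = sse([ys[i] for i in lo]) + sse([ys[i] for i in hi])
            if best is None or cost < best[0]:
                best = (cost, j, t, lo, hi)
    if best is None:
        return mean(ys)
    _, j, t, lo, hi = best
    return (j, t,
            build_tree([rows[i] for i in lo], [ys[i] for i in lo], depth - 1, min_leaf),
            build_tree([rows[i] for i in hi], [ys[i] for i in hi], depth - 1, min_leaf))

def tree_predict(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, lo, hi = node
        node = lo if row[j] <= t else hi
    return node

def train_forest(rows, ys, n_trees=25, seed=0):
    """Step (c): fit each tree on a bootstrap resample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [ys[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    """Step (d): the prediction is the average over the individual trees."""
    return mean(tree_predict(tree, row) for tree in forest)

# Hypothetical toy data: [total viewership, rating, unrelated column] per program.
rows = [[100, 1.0, 5], [200, 2.1, 3], [300, 2.9, 8], [400, 4.2, 1],
        [500, 5.0, 7], [600, 6.1, 2], [700, 6.8, 9], [800, 8.1, 4]]
spend = [10, 21, 29, 41, 52, 60, 71, 80]  # known ad spend (made-up units)

keep = top_features(rows, spend, k=2)
forest = train_forest([[r[j] for j in keep] for r in rows], spend)
new_program = [650, 6.4, 6]  # program with unknown ad spend
estimate = forest_predict(forest, [new_program[j] for j in keep])
```

In this sketch each media program is reduced to a numeric row; in practice the network and the date and time of airing or streaming would also need to be encoded as variables before steps (a) and (b).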