US Pat. No. 9,275,114

APPARATUS AND METHOD FOR PROFILING USERS

PlaceIQ, Inc., New York,...

1. A method of profiling a user of a mobile computing device, the method comprising:
obtaining a location history of a user, the location history being based on signals from a mobile computing device of the
user;

obtaining a location-attribute score of a location identified in, or inferred from, the location history;
determining, with a computer, a user-attribute score based on the location-attribute score; and
storing the user-attribute score in a user-profile datastore, wherein:
the location history comprises a list of geolocation records, each geolocation record including geographic coordinates expressed
as a latitude and longitude and a time at which the mobile computing device was at the respective coordinates, each geolocation
record being obtained by an end-user portable device having access to a location identifying service;

obtaining a location-attribute score comprises:
inferring locations between locations identified in the location history;
for each identified or inferred location, retrieving respective tile records from a GIS, the tile records corresponding to
a tile in which the respective location is disposed and adjacent tiles, each tile record corresponding to a geographic area
of between 100 square meters and 100,000 square meters and being associated with one or more location-attribute scores, each
location-attribute score corresponding to an activity of interest to advertisers and ordinal values indicative of a likelihood
that a user is engaged in the respective activity in the tile during each of a plurality of time-bins, the time-bins defining
different subsets of a week;

determining a user-attribute score comprises:
determining that location attribute scores for the tile records for the time-bin in which the user was at the location are
consistent among the adjacent tiles; and

in response, determining a plurality of user-attribute scores corresponding to the location-attribute scores, the respective
user-attribute score being an average of the corresponding location-attribute score for a time-bin including the time at which
the user was at the location and previous scores for the attribute from other locations;

storing the user-attribute score comprises:
storing the averaged user-attribute scores in a user profile in the user-profile datastore, the user profile being stored on a
tangible, non-transitory, machine-readable medium, and the user-profile datastore being operative to respond to queries from
advertisers for data relevant to the selection of advertisements;

receiving a query for data relevant to the selection of advertisements; and
responding to the query based on responsive data stored in the user-profile datastore.
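
The scoring steps of this claim can be sketched as follows. This is only a minimal illustration under stated assumptions: the tolerance-based consistency test, the 0-10 ordinal scale, and the function names are hypothetical and do not come from the patent text.

```python
from statistics import mean

def tiles_consistent(scores, tolerance=1):
    """Assumed test: ordinal scores of a tile and its neighbours differ by at most `tolerance`."""
    return max(scores) - min(scores) <= tolerance

def update_user_score(previous_scores, tile_score):
    """Average the new location-attribute score with earlier scores for the same attribute."""
    return mean(previous_scores + [tile_score])

# Hypothetical ordinal scores for one attribute and one time-bin:
# the tile containing the user (first entry) followed by its eight neighbours.
center_and_neighbors = [7, 7, 6, 7, 7, 6, 7, 7, 7]
previous = [5, 6]  # scores for this attribute from earlier locations

user_score = None
if tiles_consistent(center_and_neighbors):
    user_score = update_user_score(previous, center_and_neighbors[0])
```

Here the averaged user-attribute score is only computed when the adjacent tiles agree, mirroring the "in response" wording of the claim.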

US Pat. No. 10,089,367

EXPEDITING PATTERN MATCHING QUERIES AGAINST TIME SERIES DATA

PlaceIQ, Inc., New York,...

1. A method, comprising:
obtaining activity profiles of more than 10,000 individuals, each activity profile corresponding to a respective individual, and each activity profile including a plurality of activity records, each activity record indicating a geolocation, a timestamp indicating when the individual was at the geolocation, and an attribute of the geolocation other than geospatial or temporal attributes, the activity profiles being based, at least in part, on network traffic generated by the individuals with respective mobile computing devices;
for each activity profile, sorting the activity records in order of the timestamps to form respective sorted activity profiles;
obtaining a query having a rule specifying criteria to select a subset of the individuals, the criteria comprising:
an activity pattern comprising an activity, an amount of instances of the activity, a first relational operator, and a pattern duration of time over which the activity pattern is evaluated to determine whether the amount of instances of the activity satisfy a first condition specified by the first relational operator; and
a quantifier comprising an amount of instances of the activity pattern, a second relational operator, and a quantifier duration of time over which the quantifier is evaluated to determine whether the amount of instances of the activity pattern satisfies a second condition specified by the second relational operator;
for each sorted activity profile, with one or more processors:
initializing an activity pattern count;
initializing a quantifier count;
iterating through the sorted activity records in sorted order and at each iteration:
determining whether the attribute of the geolocation of the respective activity record matches the activity of the activity pattern and, in response to determining a match:
determining the activity pattern count;
determining whether the activity pattern count satisfies the first condition and, in response to determining that the first condition is satisfied:
 initializing the activity pattern count;
 determining the quantifier count; and
 determining whether the quantifier count satisfies the second condition and, in response to determining that the second condition is satisfied, designating the individual corresponding to the respective sorted activity profile as responsive to the query.
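
The single-pass evaluation described in this claim can be sketched as below. It is a rough illustration, not the patented implementation: the sliding-window bookkeeping with deques, the tuple layout `(timestamp, attribute)`, and all names are assumptions, and the sketch is only meaningful for "at least"-style relational operators such as `operator.ge`.

```python
import operator
from collections import deque

def matches(records, activity, n, op1, pattern_window, m, op2, quant_window):
    """Return True if one profile satisfies the activity pattern and the quantifier."""
    hits = deque()       # timestamps of matching records inside the pattern duration
    instances = deque()  # timestamps of completed pattern instances inside the quantifier duration
    for ts, attr in sorted(records):  # iterate in timestamp order
        if attr != activity:
            continue
        hits.append(ts)
        while ts - hits[0] > pattern_window:
            hits.popleft()
        if op1(len(hits), n):             # first condition satisfied
            hits.clear()                  # re-initialize the activity pattern count
            instances.append(ts)
            while ts - instances[0] > quant_window:
                instances.popleft()
            if op2(len(instances), m):    # second condition satisfied
                return True
    return False

# Hypothetical profile with timestamps in days: two bursts of three gym visits.
profile = [(1, "gym"), (2, "gym"), (2, "cafe"), (3, "gym"),
           (10, "gym"), (11, "gym"), (12, "gym")]
# "At least 3 gym visits within 7 days, at least 2 times within 30 days."
responsive = matches(profile, "gym", 3, operator.ge, 7, 2, operator.ge, 30)
```

Because the records are sorted once, each profile is scanned in a single pass and both counts are maintained incrementally.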

US Pat. No. 9,483,498

APPARATUS AND METHOD FOR PROFILING USERS

PlaceIQ, Inc., New York,...

1. A datacenter configured to expedite generation of user profiles based on time-dependent attributes of geolocations sensed
by mobile computing devices, the datacenter comprising:
a primary computing device having one or more processors and storing an instance of an operating system;
a local area network; and
a plurality of secondary computing devices communicatively coupled with the primary computing device via the local area network,
each secondary computing device having one or more processors and storing an instance of an operating system, wherein the
primary computing device and the plurality of secondary computing devices store instructions that when executed by the primary
computing device and the plurality of secondary computing devices effectuate operations comprising:

obtaining, in memory, location histories of a plurality of users, the location histories including geolocations of corresponding
mobile computing devices and times at which the mobile computing devices were at the geolocations;

querying, with one or more processors, a geographic information system (GIS), with the geolocations and times, for time-dependent
attribute scores of places the location histories indicate at least some of the users visited, wherein the GIS associates
each of the places with a plurality of different durations of time and each of the durations of time with attribute scores
for more than 100 different attributes;

generating, with one or more processors, user profiles of the plurality of users based on time-dependent attribute scores
responsive to the query, wherein the user profiles each include a plurality of profile-attribute scores based on the time-dependent
attribute scores responsive to the query for places visited by a corresponding user, wherein generating user profiles comprises:

assigning, with the primary computing device, different profiling tasks to each of a plurality of different secondary computing
devices; and

performing the profiling tasks by determining, with the secondary computing devices, at least some of the plurality of profile-attribute
scores; and

storing the generated user profiles in memory.
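
The division of profiling work between a primary device and secondary devices might be sketched as follows. This is a single-machine stand-in under loud assumptions: worker threads play the role of the secondary computing devices, the per-visit score dictionaries stand in for GIS query results, and averaging is an assumed way of combining scores.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def profile_user(visit_scores):
    """Profiling task: average each attribute's score over the places one user visited."""
    attributes = {a for scores in visit_scores for a in scores}
    return {a: mean(s[a] for s in visit_scores if a in s) for a in attributes}

def build_profiles(histories, workers=4):
    """The 'primary' assigns one profiling task per user to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(profile_user, histories.values())
        return dict(zip(histories.keys(), results))

# Hypothetical time-dependent attribute scores already retrieved for each visit.
histories = {
    "u1": [{"dining": 8, "retail": 2}, {"dining": 4}],
    "u2": [{"retail": 10}],
}
profiles = build_profiles(histories)
```

In a real cluster the tasks would be shipped over the local area network rather than handed to threads; the sketch only shows the assignment pattern.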

US Pat. No. 9,589,280

MATCHING ANONYMIZED USER IDENTIFIERS ACROSS DIFFERENTLY ANONYMIZED DATA SETS

PlaceIQ, Inc., New York,...

1. A method of generating user profiles, the method comprising:
obtaining a plurality of location data sets from different providers of user geolocation history, each location data set including
a plurality of user-activity records, each user-activity record being associated with a user identifier and including geolocations
of the corresponding user and times that the corresponding user was at the geolocations, the different providers having different
user identifiers for a given corresponding user;

matching, by one or more processors, the user identifiers between the location data sets based on geolocations of the corresponding
user and times that the corresponding user was at the geolocations; and

storing the matched user identifiers in association with one another in corresponding user profiles, wherein:
the geolocations are sensed based on wireless signals received by or transmitted from mobile user devices;
the location data sets are anonymized such that no canonical user identifier is present for the corresponding users across
the plurality of location data sets;

at least some of the location data sets are obtained with a batch process such that the at least some of the location data
sets contain data collected over at least an hour;

the user identifiers uniquely identify each corresponding user within a corresponding location data set; and
the location data sets comprise:
a first location data set having a plurality of user location histories from a cell carrier indicative of geolocations of
cellular phones while on a cellular network of the cell carrier; and

a second location data set having geolocations obtained by native applications executing on at least some of the cellular
phones, the native applications being provided by an entity different from the cell carrier, at least some of the native-application-obtained
geolocations being obtained by a LocationProvider class of a first object oriented program executing on at least some of the
cellular phones and at least some of the native-application-obtained geolocations being obtained by a CLLocationManager class
of a second object oriented program executing on at least some of the other cellular phones.
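
One way to picture the matching step is to count how often two identifiers from different data sets appear in the same place at about the same time. The sketch below assumes a grid-and-hour bucketing scheme, a `min_shared` threshold, and record tuples of `(user_id, lat, lon, unix_time)`; none of these details are stated in the claim.

```python
from collections import Counter, defaultdict

def cell(lat, lon, ts, grid=0.001, slot=3600):
    """Bucket a record into a coarse space-time cell (assumed granularity)."""
    return (round(lat / grid), round(lon / grid), ts // slot)

def match_ids(set_a, set_b, min_shared=2):
    """Map each identifier in set_a to the set_b identifier sharing the most cells."""
    cells_b = defaultdict(set)
    for uid, lat, lon, ts in set_b:
        cells_b[cell(lat, lon, ts)].add(uid)
    shared = defaultdict(Counter)
    seen = set()
    for uid, lat, lon, ts in set_a:
        c = cell(lat, lon, ts)
        if (uid, c) in seen:  # count each cell once per identifier
            continue
        seen.add((uid, c))
        for other in cells_b.get(c, ()):
            shared[uid][other] += 1
    matches = {}
    for uid, counts in shared.items():
        best, n = counts.most_common(1)[0]
        if n >= min_shared:
            matches[uid] = best
    return matches

# Hypothetical records from a carrier feed and from a native-application feed.
carrier = [("c1", 40.7128, -74.0060, 0), ("c1", 40.7306, -73.9866, 7200),
           ("c2", 40.7580, -73.9855, 0)]
app = [("a9", 40.7128, -74.0060, 600), ("a9", 40.7306, -73.9866, 7800),
       ("a7", 40.7580, -73.9855, 100)]
matched = match_ids(carrier, app)
```

Requiring more than one shared cell reduces accidental matches between users who merely crossed paths once.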

US Pat. No. 9,589,048

GEOLOCATION DATA ANALYTICS ON MULTI-GROUP POPULATIONS OF USER COMPUTING DEVICES

PlaceIQ, Inc., New York,...

1. A method comprising:
obtaining, with one or more computer processors, one or more network traffic logs documenting communications via a network
between one or more servers and more than 10,000 user computing devices, wherein at least some of the communications are associated
with a respective timestamp and a device identifier of a respective user computing device among the more than 10,000 user
computing devices;

defining for at least some user computing devices identified in the one or more network traffic logs, with one or more computer
processors, a plurality of device groups, each device group being defined, at least in part, as a consequence of a subset
of the user computing devices sharing a respective attribute, each subset having a plurality of user computing devices, each
device group being defined by at least one different attribute from the other device groups, at least a plurality of the user
computing devices being un-grouped devices that are not in one of the plurality of device groups, wherein at least some of the
user computing devices appear in multiple device groups;

for each device group among the plurality of device groups, selecting, with one or more computer processors, from the respective
device group:

a receiver grouped-device collection to receive a network communication of one or more content items, the receiver grouped-device
collection having a plurality of the user computing devices; and

a reserve grouped-device collection to not receive a network communication of the one or more content items, the reserve grouped-device
collection having a plurality of the user computing devices;

selecting, with one or more computer processors, from the un-grouped devices:
a receiver ungrouped-device collection to receive a network communication of the one or more content items, the receiver ungrouped-device
collection having a plurality of the user computing devices; and

a reserve ungrouped-device collection to not receive a network communication of the one or more content items, the reserve
ungrouped-device collection having a plurality of the user computing devices;

receiving, with one or more computer processors, indications of network requests for content from more than 1,000 of the
user computing devices and, within less than 500 milliseconds of receiving the indication, for each of the indications, determining:

to send at least some of the one or more content items to the respective user computing device issuing a request causing the
respective indication in response to the respective user computing device being in either the receiver grouped-device collection
or receiver ungrouped-device collection; xor

to not send any of the one or more content items to the respective user computing device issuing a request causing the respective
indication in response to the respective user computing device being in either the reserve grouped-device collection or the
reserve ungrouped-device collection;

storing a result of the determinations in memory;
sending, with one or more computer processors, signals that effectuate the determinations;
obtaining one or more geolocations to be measured for user visits;
obtaining one or more updated network traffic logs after sending at least some of the signals that effectuate the determinations,
the updated network traffic logs including device identifiers of at least some of the more than 10,000 user computing devices
and geolocations from which the at least some of the more than 10,000 user computing devices communicated;

determining visit amounts at the one or more geolocations to be measured for a plurality of the device groups based on geolocations
in the one or more updated network traffic logs; and

storing the determined visit amounts in memory.
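
The receiver/reserve split and the later visit measurement can be sketched as below. The sketch is simplified to a single population rather than per device group, and the hash-based assignment, the 20% reserve fraction, and the function names are assumptions made for illustration.

```python
import hashlib

def assign(device_id, campaign="demo", reserve_fraction=0.2):
    """Deterministically place a device in the receiver or reserve collection."""
    digest = hashlib.sha256(f"{campaign}:{device_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1]
    return "reserve" if bucket < reserve_fraction else "receiver"

def should_send(device_id, campaign="demo"):
    """Decision made at request time: send content only to receiver devices."""
    return assign(device_id, campaign) == "receiver"

def visit_rates(device_ids, visited, campaign="demo"):
    """Share of devices in each collection later seen at the measured geolocations."""
    totals = {"receiver": 0, "reserve": 0}
    visits = {"receiver": 0, "reserve": 0}
    for d in device_ids:
        group = assign(d, campaign)
        totals[group] += 1
        visits[group] += d in visited
    return {g: visits[g] / totals[g] if totals[g] else 0.0 for g in totals}

devices = [f"dev{i}" for i in range(10000)]
reserve_share = sum(assign(d) == "reserve" for d in devices) / len(devices)
```

A hash lookup of this kind needs no stored state per device, which is one plausible way to keep the per-request decision well under the 500 millisecond bound mentioned in the claim.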

US Pat. No. 10,038,968

BRANCHING MOBILE-DEVICE TO SYSTEM-NAMESPACE IDENTIFIER MAPPINGS

PlaceIQ, Inc., New York,...

1. A method of joining data from feeds from multiple sources of computing device network activity data having heterogeneous device identifier namespaces and device identifier to device mappings that change over time, the method comprising:
accessing, with one or more processors, three or more sources of network activity log data from three or more different sources of network activity data, wherein:
each source of network activity log data describes network activity by more than 100,000 mobile computing devices,
each source of network activity log data describes activities over a duration of time longer than one hour,
each source of network activity log data provides transaction records of more than 1 million transactions by at least some of the mobile computing devices, each transaction record including one or more external-namespace device identifiers in an external namespace of a respective mobile computing device participating in the respective network transaction, and
the transaction records associate geolocations reported by the mobile computing devices with timestamps and external-namespace device identifiers of the mobile computing devices;
for each of the sources of network activity log data, based on the respective network activity log data, updating, with one or more processors, a multi-namespace mapping that maps the external-namespace device identifiers to internal-namespace device identifiers in an internal namespace of a system configured to profile mobile computing devices based on geolocations in logged network activity data of the mobile computing devices, wherein:
the namespace mapping comprises a plurality of external-namespace-specific mappings each mapping a respective type of device identifier in a respective external namespace used in the network activity log data to one or more internal-namespace device identifiers, and
at least some of the external-namespace device identifiers are mapped in at least some of the external-namespace-specific mappings to a plurality of internal-namespace device identifiers, with a given device external-namespace device identifier being mapped to a given plurality of internal-namespace device identifiers;
after updating the multi-namespace mapping, receiving, with one or more processors, an external-namespace device identifier;
selecting, with one or more processors, one of the external-namespace-specific mappings based on the external namespace of the received external-namespace device identifier;
accessing, with one or more processors, a plurality of internal-namespace device identifiers mapped to the received external-namespace device identifier by the selected external-namespace-specific mapping; and
accessing, with one or more processors, a device profile associated with at least some of the plurality of internal-namespace device identifiers.
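
A branching mapping of this kind can be pictured as one dictionary per external namespace, each mapping an external identifier to a set of internal identifiers. The class below is a minimal sketch; the class name, the namespace labels, and the profile store are hypothetical.

```python
from collections import defaultdict

class NamespaceMapping:
    """One external-namespace-specific mapping per namespace (assumed layout)."""

    def __init__(self):
        self._maps = defaultdict(lambda: defaultdict(set))

    def update(self, namespace, external_id, internal_id):
        """Record that an external identifier may refer to an internal device identifier."""
        self._maps[namespace][external_id].add(internal_id)

    def lookup(self, namespace, external_id):
        """Select the mapping for the namespace and return all mapped internal identifiers."""
        return set(self._maps.get(namespace, {}).get(external_id, ()))

mapping = NamespaceMapping()
mapping.update("ad_id", "X", "d1")
mapping.update("ad_id", "X", "d2")   # the same external identifier branches to two devices
mapping.update("cookie", "X", "d3")  # the same string in another namespace is kept separate

profiles = {"d1": {"segment": "commuter"}, "d3": {"segment": "shopper"}}
found = [profiles[i] for i in sorted(mapping.lookup("ad_id", "X")) if i in profiles]
```

Keeping the namespaces apart avoids collisions when two feeds happen to issue the same identifier string for different devices.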

US Pat. No. 10,218,808

SCRIPTING DISTRIBUTED, PARALLEL PROGRAMS

PlaceIQ, Inc., New York,...

1. A method, comprising:
obtaining a specification of a data analysis to be performed in parallel on a computing cluster comprising a plurality of computing nodes;
parsing the specification of the data analysis to identify rules applicable to the data analysis;
based on the parsed specification of the data analysis, determining which data is implicated in each portion of the data analysis to be assigned to the plurality of computing nodes of the computing cluster;
determining that a portion of the implicated data is not already present in memory of at least some of the plurality of computing nodes of the computing cluster;
in response to the determination, distributing the portion of the implicated data according to an index that positions related values of the data on the same computing nodes of the computing cluster, wherein distributing the portion of the implicated data comprises calculating index values for tile records based on geographic location such that tile records for adjacent geographic locations are grouped together on the same computing nodes;
translating the parsed specification of the data analysis into mapper rules and reducer rules, at least one of which includes one or more parameters specific to data on a given computing node among the plurality of computing nodes of the computing cluster;
determining which computing nodes of the computing cluster have data relevant to which rules of the mapper rules and the reducer rules and sending the mapper rules and the reducer rules to the corresponding computing nodes for execution in MapReduce routines;
executing the mapper rules and the reducer rules on the corresponding computing nodes; and
aggregating results of executing one or more of the mapper rules and the reducer rules.
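
The claim does not name a specific index. One common way to keep nearby tiles together is a Z-order (Morton) code, used here purely as an assumed example; the block size and node count are likewise illustrative.

```python
def morton(x, y, bits=16):
    """Interleave the bits of tile column x and tile row y into a single index value."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def node_for_tile(x, y, tiles_per_node=16, nodes=4):
    """Assign consecutive runs of index values to the same computing node."""
    return (morton(x, y) // tiles_per_node) % nodes

# All tiles of the aligned 4x4 block at the origin receive index values 0..15.
block_nodes = {node_for_tile(x, y) for x in range(4) for y in range(4)}
```

With these parameters every tile in an aligned 4x4 block lands on one node. Tiles that are adjacent across a block boundary can still end up on different nodes, so the grouping is approximate.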

US Pat. No. 10,262,330

LOCATION-BASED ANALYTIC PLATFORM AND METHODS

PlaceIQ, Inc., New York,...

1. A method of learning an audience member function, the method comprising:
obtaining, with one or more processors, a training set of geographic data describing geolocation histories of a plurality of mobile devices, wherein members of the training set are classified according to whether the respective member of the training set is a member of an audience, and wherein
each geolocation history corresponds to a different user or computing device selected from among a set of more than 100,000 geolocation histories; and
at least some geolocation histories each comprise a respective plurality of timestamped geolocations collected over more than a week;
retrieving, with one or more processors, attributes of geolocations in the geolocation histories from a geographic information system, wherein, for at least some geolocations in the geolocation histories, a plurality of attributes are retrieved for respective geolocations, the attributes each indicating a propensity of users to exhibit a different respective behavior described by the respective attribute in a respective geolocation;
learning, with one or more processors, feature functions of an audience member function based on the training set, wherein at least some of the feature functions are a function of the retrieved attributes of geolocation, wherein the feature functions are learned, at least in part, by calculating a plurality of impurity measures for candidate feature functions and selecting one of the candidate feature functions based on the relative values of the impurity measures, and wherein:
the audience member function is configured to output a score indicative of a probability that a given user is in, or classification of the given user in, the audience;
the audience member function is configured to output the score based on a given input vector, corresponding to the given user;
the given input vector is based on a given geolocation history of the user and has a plurality of dimensions, at least some of the plurality of dimensions being based on at least some of the plurality of attributes; and
the feature functions are learned, at least in part, by performing steps comprising:
selecting a subset of the training set that has a selected dimension larger than a threshold value;
for each of a plurality of other dimensions, and for each of a plurality of values of each of the plurality of other dimensions, calculating an impurity measure corresponding to the respective value in the respective other dimension; and
selecting another dimension and a value based on the smallest impurity measure among the calculated impurity measures; and
storing, with one or more processors, the feature functions of the audience member function in an audience repository.
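
The impurity-based selection of a dimension and value resembles the split search used when growing decision trees. The sketch below assumes Gini impurity and binary audience labels; the claim itself does not say which impurity measure is used, and the attribute names are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 audience labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, dims):
    """Return (impurity, dimension, value) with the smallest weighted impurity."""
    best = None
    for d in dims:
        for threshold in sorted({r[d] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[d] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[d] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, d, threshold)
    return best

# Hypothetical input vectors built from location attributes, with audience labels.
rows = [{"gym": 0.9, "cafe": 0.2}, {"gym": 0.8, "cafe": 0.7},
        {"gym": 0.1, "cafe": 0.3}, {"gym": 0.2, "cafe": 0.8}]
labels = [1, 1, 0, 0]
split = best_split(rows, labels, ["gym", "cafe"])
```

In this toy data the "gym" dimension separates audience members perfectly at a value of 0.2, so it is selected with an impurity of zero.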

US Pat. No. 10,235,683

ANALYZING MOBILE-DEVICE LOCATION HISTORIES TO CHARACTERIZE CONSUMER BEHAVIOR

PlaceIQ, Inc., New York,...

1. A method of inferring a user's reason for movement between geolocations sensed by a mobile device of the user, the method comprising:
obtaining, with one or more processors, a history of time-stamped geolocations of a user, the time-stamped geolocations being obtained based on data from one or more computing devices associated with the user and reported to one or more remote server systems;
selecting, with one or more processors, a plurality of geographic areas based on each of the selected geographic areas including at least one of the time-stamped geolocations;
associating, with one or more processors, the selected geographic areas with the time-stamp of the included geolocation to establish a time series sequence of the geographic areas;
training, with one or more processors, a probabilistic model, wherein training the probabilistic model comprises:
obtaining an initial set of probabilities for the model;
selecting a training set of time series sequences of geographic areas for the user, wherein the training set comprises at least a portion of the established time series sequence of the geographic areas;
selecting a plurality of user events, each user event being an underlying potential reason why users move between geographic locations;
calculating a plurality of probabilities of the model by iterating steps comprising:
estimating, for each pairwise combination of user events in the plurality of user events, a probability of the user transitioning between the pairwise combination of the user events based on the training set and, in a first iteration, the initial set of probabilities and, in a subsequent iteration, a revised set of probabilities;
normalizing the estimated probabilities of the user transitioning between the pairwise combinations of user events to form revised probabilities of the user transitioning between the pairwise combinations of user events;
estimating the probabilities of obtaining a geolocation reported by the computing devices associated with the user based on the training set and the revised probabilities of the user transitioning between the pairwise combinations of the user events; and
normalizing the estimated probabilities of obtaining a geolocation reported by the computing devices associated with the user to form revised probabilities of obtaining a geolocation reported by the computing devices associated with the user,
wherein the revised set of probabilities includes both the revised probabilities of the user transitioning between the pairwise combinations of the user events and the revised probabilities of obtaining a geolocation reported by the computing devices associated with the user;
determining, with the trained probabilistic model, parameters for an input time series sequence of geographic areas based on a recent subset of time-stamped geolocations of the user in the history, the parameters comprising:
a plurality of candidate user events from the plurality of user events, each candidate user event being an underlying potential reason why the user moved between geographic locations;
probabilities of the user transitioning between each pair of the candidate user events; and
probabilities of obtaining a geolocation reported by the computing devices associated with the user in each of the geographic areas in the input time series sequence of the geographic areas following occurrence of each of the candidate user events;
inferring, with one or more processors, a reason why the user transitioned between a given sequential pair of the geographic areas in the input time series sequence of geographic areas responsive to the parameters determined by the trained probabilistic model, wherein the reason comprises one of the candidate user events; and
storing, with one or more processors, the inferred reason in memory.
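
The parameters named in this claim (transition probabilities between user events and probabilities of observing a geographic area after an event) have the shape of a hidden Markov model. The sketch below covers only the final inference step and uses Viterbi decoding, which is a technique substituted here for illustration and is not named in the claim; the event names, areas, and probabilities are invented.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely sequence of hidden user events for a sequence of geographic areas."""
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            # Best predecessor event for reaching event s at this step.
            p, src = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][obs], src)
        best.append(layer)
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for layer in reversed(best[1:]):  # follow back-pointers to the start
        state = layer[state][1]
        path.append(state)
    return path[::-1]

states = ["commute", "errand"]
start_p = {"commute": 0.6, "errand": 0.4}
trans_p = {"commute": {"commute": 0.7, "errand": 0.3},
           "errand": {"commute": 0.4, "errand": 0.6}}
emit_p = {"commute": {"home": 0.3, "office": 0.6, "store": 0.1},
          "errand": {"home": 0.3, "office": 0.1, "store": 0.6}}
reasons = viterbi(["home", "office", "store"], states, start_p, trans_p, emit_p)
```

The iterative estimate-and-normalize training loop recited in the claim would supply `trans_p` and `emit_p`; it is not reproduced in this sketch.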