Replica produces large-scale models that accurately represent mobility, economic activity, people, and land use in detail throughout the United States.
Replica customers have access to the datasets described below, each with a detailed description and schema.
Data Tables:
Replica uses a multi-level land use classification to support aggregation and analysis at different granularities. Details on this classification can be found below in the schema. Not all primary (L1) categories have distinct secondary (L2) sub-categories; where no sub-categories are defined, the L2 land use is inherited directly from the L1. Mixed-use parcels can have multiple L2 land uses.
Mobile homes are counted as “single family”.
We do not surface any distinction between detached and attached single-family homes.
“Non_retail_attraction” refers to any commercial site that does not easily fall into the first two, e.g. stadiums, theaters, commercial recreation, etc.
A mixed use L1 can have a mix of any other defined L2s, such as:
• “multi_family” and “office”
• “retail” and “industrial”
N/A
The “civic_institutional” L2 is a catch-all for any non-healthcare, non-education civic institutional land use, including military bases, administrative buildings, etc.
N/A
N/A
Includes parks, national forests, and other uncategorized open space.
Catch-all for known land uses that don’t fall into any other category, e.g. vacant land or construction sites.
Parcels/buildings have an “unknown” land use when we have neither disaggregate NOR aggregate ground truth for the tract they are located in.
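The inheritance rule described above can be sketched in a few lines of Python. This is an illustrative helper only: the function name `resolve_l2` and its arguments are hypothetical and are not part of any Replica API. It assumes a missing L2 is represented as `None` or an empty string.

```python
from typing import List, Optional, Sequence


def resolve_l2(l1: str, l2_values: Sequence[Optional[str]] = ()) -> List[str]:
    """Return the L2 land uses for a parcel (hypothetical helper).

    If no distinct L2 value is supplied, the L2 inherits from the L1.
    Mixed-use parcels may carry more than one L2 value.
    """
    # Drop missing entries (None or empty string) before deciding.
    present = [value for value in l2_values if value]
    return present if present else [l1]


# An L1 without sub-categories: the L2 is the same as the L1.
print(resolve_l2("open_space"))
# A mixed-use parcel with two L2 land uses and one empty slot.
print(resolve_l2("mixed_use", ["multi_family", "office", None]))
```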
The land use parcel table contains a nationwide snapshot of parcels, their land use, their total built square footage, and their dwelling unit count.
Replica models about 150 million disaggregate parcels nationwide. The built square footage is a modeled representation of total three-dimensional building space. There could be multiple buildings per parcel.
Unique identifier for a parcel.
The US Census Bureau-assigned GEOID of the census block group containing the parcel.
The US Census Bureau-assigned GEOID of the census tract containing the parcel.
The square footage of the parcel geometry.
The total, 3-dimensional built square footage across all floors of all buildings on the parcel. Each parcel could have more than one building. Note: since we do not have ground truth 3D area per building, this value is modeled from parcel attributes, building footprints, and aggregate building area totals.
Top-level land use of the parcel.
Valid values are:
• residential
• commercial
• mixed_use
• civic_institutional
• industrial
• transportation_utilities
• agriculture
• open_space
• other
• unknown
Sub-category of the top-level (L1) land use. For L1s that do not have sub-categories, this value is the same as the L1. Valid values are:
Null for parcels that do not have a “mixed_use” L1. For mixed-use parcels, this is another subcategory, and shares the same possible values as landuse_l2_primary.
See landuse_l2_secondary.
Number of dwelling units on the parcel. It is 0 for all parcels that are neither residential nor mixed-use. It is not guaranteed to be greater than 0 for residential or mixed-use parcels, because the parcel land use data is recent (2021) while dwelling_units is modeled from scaled 2010 Census block counts.
Latitude of the parcel centroid, reported in decimal degrees.
Longitude of the parcel centroid, reported in decimal degrees.
Parcel geometry.
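Because dwelling_units can be 0 on residential or mixed-use parcels, it may be useful to flag those rows before running any per-unit analysis. The sketch below assumes each parcel is a plain dictionary. The key `landuse_l1` is an assumed name for the top-level land use column, and the sample rows are invented for illustration.

```python
# L1 values that imply housing may be present on the parcel.
HOUSING_L1 = {"residential", "mixed_use"}


def flag_zero_unit_housing(parcels):
    """Return parcels whose L1 implies housing but whose modeled
    dwelling_units is 0 (hypothetical helper)."""
    return [
        p for p in parcels
        if p["landuse_l1"] in HOUSING_L1 and p["dwelling_units"] == 0
    ]


# Invented sample rows, for illustration only.
sample = [
    {"parcel_id": "a", "landuse_l1": "residential", "dwelling_units": 0},
    {"parcel_id": "b", "landuse_l1": "residential", "dwelling_units": 4},
    {"parcel_id": "c", "landuse_l1": "industrial", "dwelling_units": 0},
]
flagged = flag_zero_unit_housing(sample)
print([p["parcel_id"] for p in flagged])
```

Only parcel "a" is flagged here: parcel "c" also has 0 dwelling units, but that is expected for a parcel that is neither residential nor mixed-use.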
The lot (Off-Street) parking table covers public and commercial parking lots, and includes details about the number of spaces, rate structure, open hours, operators, and other lot attributes. This dataset does not include private and residential parking.
Unique identifier for off-street parking lot.
Encoded representation of parking lot geometry.
WKT representation of parking lot geometry.
WKT POINT (lng, lat) representation of parking lot centroid.
The area of the parking lot in square feet.
2010 census block ID of parking lot centroid.
The number of parking spaces in the parking lot.
The number of handicap parking spaces in the parking lot.
See landuse_l2_secondary.
Physical description of parking lot (surface, subterranean, etc.).
Access restrictions, e.g., non-restricted, or valet only.
General purpose of lot, e.g., public-parking, or venue parking.
Information on pricing, when available.
Information on operating hours, when available.
The name of the operating company or agency, when available.
The average number of available spaces by hour-of-day for a typical weekday. Hours are reported in the local time zone in 24-hour time format.
The average number of available spaces by hour-of-day, for a typical weekend. Hours are reported in the local time zone in 24-hour time format.
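As a rough sketch of how the space count and the hourly availability fields might be combined, the helper below converts average available spaces into an occupancy share per hour. It assumes the availability field has already been parsed into a mapping of hour (0 to 23) to average available spaces. The actual storage format of that field is not described here, and the function name is hypothetical.

```python
def occupancy_by_hour(total_spaces, available_by_hour):
    """Return the occupied share of the lot (0.0 to 1.0) for each hour.

    total_spaces: number of parking spaces in the lot.
    available_by_hour: assumed mapping of hour -> average available spaces.
    """
    if total_spaces <= 0:
        raise ValueError("total_spaces must be positive")
    shares = {}
    for hour, available in sorted(available_by_hour.items()):
        occupied = (total_spaces - available) / total_spaces
        # Clamp in case the modeled availability exceeds the space count.
        shares[hour] = min(max(occupied, 0.0), 1.0)
    return shares


# Invented values: a 100-space lot on a typical weekday.
print(occupancy_by_hour(100, {8: 40.0, 12: 10.0, 22: 120.0}))
```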
The on-street parking table includes the number of parking spaces located along individual sections of a block. Coverage is limited to specific parts of an area, typically high-density ones.
A unique identifier for on-street parking segment.
The corresponding OSM segment ID.
Encoded representation of parking segment geometry.
WKT representation of parking segment geometry.
WKT POINT (lng, lat) representation of parking segment centroid.
2010 census block ID of parking segment centroid.
The number of parking spaces along the segment.
Information on pricing, when available.
The expected probability of finding an open spot by hour-of-day, for a typical weekday. Hours are reported in the local time zone in 24-hour time format. Values range from 0 to 100.
The expected probability of finding an open spot by hour-of-day, for a typical weekend. Hours are reported in the local time zone in 24-hour time format. Values range from 0 to 100.
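One possible use of the hourly probability fields is to pick the hours in which parking is likely to be found. The sketch below assumes the field has been parsed into a mapping of hour to a value on the 0 to 100 scale described above. The function name and the sample values are hypothetical.

```python
def hours_meeting_threshold(open_pct_by_hour, min_pct):
    """Return the hours whose open-spot probability is at least min_pct.

    open_pct_by_hour: assumed mapping of hour (0-23) -> value from 0 to 100.
    min_pct: threshold on the same 0 to 100 scale.
    """
    if not 0 <= min_pct <= 100:
        raise ValueError("min_pct must be between 0 and 100")
    return sorted(
        hour for hour, pct in open_pct_by_hour.items() if pct >= min_pct
    )


# Invented weekday values for a single segment.
weekday = {8: 35, 12: 10, 20: 80, 2: 95}
print(hours_meeting_threshold(weekday, 50))
```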
Replica’s AADT data includes two metrics: (1) the annual average daily traffic (AADT) of motor vehicles (any surface mode) on roadways, and (2) hourly volumes for each day of the week, averaged annually. Data is available for 2019 and 2021. Both metrics are available for download in shapefile format.
A unique identifier for the network link.
True if the provided traffic volume is the sum of both directions of an undivided roadway. False if the volume covers one direction only.
The common name of the network link if available. Matches the name assigned by OpenStreetMap.
The classification of the link based on OpenStreetMap data.
The number of travel lanes on the network link.
The speed limit on the link in miles per hour.
The distance (length) of the network link in meters.
The heading of the network link.
The compass direction of the network link.
The annual average daily traffic volumes of the network link.
The geometry (linestring) for each network link.
Volume of trips on the network link, broken out by day of the week (e.g., MON, FRI) and hour of the day (e.g., 0, 1, 2).
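To illustrate how the hourly volumes relate to a daily figure, the sketch below sums hourly volumes into one total per day of the week and then averages those totals. It assumes the volumes for one link have been loaded into a mapping keyed by (day, hour). That layout and the function names are hypothetical, and the resulting average should be treated as roughly comparable to the reported AADT rather than guaranteed to match it.

```python
from collections import defaultdict


def daily_totals(hourly_volumes):
    """Sum hourly volumes into one total per day of the week.

    hourly_volumes: assumed mapping of (day, hour) -> volume,
    e.g. ("MON", 7) -> 100.
    """
    totals = defaultdict(float)
    for (day, _hour), volume in hourly_volumes.items():
        totals[day] += volume
    return dict(totals)


def mean_daily_volume(hourly_volumes):
    """Average the per-day totals across the days that are present."""
    totals = daily_totals(hourly_volumes)
    if not totals:
        raise ValueError("no hourly volumes supplied")
    return sum(totals.values()) / len(totals)


# Invented values covering two hours on two days.
link_volumes = {
    ("MON", 7): 100, ("MON", 8): 150,
    ("SAT", 7): 40, ("SAT", 8): 60,
}
print(daily_totals(link_volumes))
print(mean_daily_volume(link_volumes))
```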
Replica’s turning movement count (TMC) data includes motor vehicle trip counts at most signalized intersections for each day of the week, bucketed into 1-hour intervals. The data is from spring 2022 and is available for download in CSV format.
Unique ID of the intersection.
Lat-long of the intersection in WKT POINT() format.
Maneuver at the intersection, such as left, right, through, u-turn.
Start to end compass direction.
Inbound compass direction.
Heading of the inbound road described as degrees from north (0-360).
Turning movement code, such as ‘NBL’ (northbound left) or ‘NBT’ (northbound through).
US state abbreviation indicating location of intersection.
A unique identifier for the network link.
The common name of the network link if available. Matches the name assigned by OpenStreetMap.
A unique identifier for the OSM way or supersegment ID. Can be converted to a set of stable edge IDs.
The common name of the network link if available. Matches the name assigned by OpenStreetMap.
Start hour of TMC counts represented in hh:mm:ss.
Day of the week.
TMC counts scaled from observed numbers to match total volumes along network links.
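Movement codes such as ‘NBL’ and ‘NBT’ can be split into an approach direction and a maneuver. The sketch below assumes a three-character code: two letters for the inbound direction followed by one letter for the maneuver. Only ‘NBL’ and ‘NBT’ appear in the description above, so the remaining codes in the lookup tables are assumptions based on the listed maneuvers (left, right, through, u-turn), and the function name is hypothetical.

```python
# Assumed lookup tables; only NBL and NBT are confirmed by the schema text.
APPROACHES = {
    "NB": "northbound",
    "SB": "southbound",
    "EB": "eastbound",
    "WB": "westbound",
}
MANEUVERS = {"L": "left", "T": "through", "R": "right", "U": "u-turn"}


def parse_movement(code):
    """Split a movement code like 'NBL' into (approach, maneuver)."""
    normalized = code.strip().upper()
    approach, maneuver = normalized[:2], normalized[2:]
    if approach not in APPROACHES or maneuver not in MANEUVERS:
        raise ValueError(f"Unrecognized movement code: {code!r}")
    return APPROACHES[approach], MANEUVERS[maneuver]


print(parse_movement("NBL"))
print(parse_movement("NBT"))
```

Raising an error on unknown codes, rather than guessing, keeps unexpected values visible when the real data uses a code this sketch does not anticipate.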