Replica Data Schema

Replica produces large-scale models that accurately represent mobility, economic activity, people, and land use in detail throughout the United States.

Replica customers have access to a number of datasets, each described below with its schema.

In addition to Trends and Places, Replica customers can access a number of datasets that provide utility on their own and can be used in tandem with our modeled datasets.

Data Tables:

  • Land Use Classification 
  • Land Use Parcels
  • Off-street Parking
  • On-street Parking
  • Annual Average Daily Traffic (AADT)
  • Turning Movement Counts (TMC)

Replica uses a multi-level land use classification to allow for aggregations and analysis at different granularities. Details on this classification can be found below in the schema. Please note that not all primary (L1) categories have distinct (L2) sub-categories. In the cases where we do not break down subcategories, the L2 land use inherits directly from L1. In the case of mixed-use parcels, there can be multiple L2 land uses associated with the parcel. 

| Primary category (L1) | Sub-categories (L2) | Notes |
|---|---|---|
| residential | single_family, multi_family | Mobile homes are counted as “single family”. We do not surface any distinction between detached and attached single-family homes. |
| commercial | retail, office, non_retail_attraction | “Non_retail_attraction” refers to any commercial site that does not easily fall into the first two, e.g. stadiums, theaters, commercial recreation, etc. |
| mixed_use | [any other L2] | A mixed use L1 can have a mix of any other defined L2s, such as “multi_family” and “office”, or “retail” and “industrial”. |
| industrial | industrial | N/A |
| civic_institutional | healthcare, education, civic_institutional | The “civic_institutional” L2 is a catch-all for any non-healthcare, non-education civic institutional land use, including military bases, administrative buildings, etc. |
| transportation_utilities | transportation_utilities | N/A |
| agriculture | agriculture | N/A |
| open_space | open_space | Includes parks, national forests, and other uncategorized open space. |
| other | other | Catch-all for known land uses that don’t fall into any other category, e.g. vacant land or construction sites. |
| unknown | unknown | Parcels/buildings have an “unknown” land use when we have neither disaggregate NOR aggregate ground truth for the tract they are located in. |
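
The classification can be expressed as a lookup for checking L1/L2 combinations before aggregating. This is a minimal sketch built only from the categories listed here; the names L2_BY_L1, ALL_L2, and is_valid_land_use are hypothetical and not part of any Replica tooling, and treating every L2 as allowed under mixed_use is an assumption based on the "[any other L2]" entry.

```python
# Hypothetical lookup built from the classification described above.
L2_BY_L1 = {
    "residential": {"single_family", "multi_family"},
    "commercial": {"retail", "office", "non_retail_attraction"},
    "industrial": {"industrial"},
    "civic_institutional": {"healthcare", "education", "civic_institutional"},
    "transportation_utilities": {"transportation_utilities"},
    "agriculture": {"agriculture"},
    "open_space": {"open_space"},
    "other": {"other"},
    "unknown": {"unknown"},
}

# Assumption: a mixed_use parcel may carry any L2 defined under another L1.
ALL_L2 = set().union(*L2_BY_L1.values())


def is_valid_land_use(l1, l2_values):
    """Return True if l2_values is non-empty and every L2 is allowed under l1."""
    if l1 == "mixed_use":
        allowed = ALL_L2
    else:
        allowed = L2_BY_L1.get(l1, set())
    return len(l2_values) > 0 and all(l2 in allowed for l2 in l2_values)
```

For example, is_valid_land_use("mixed_use", ["retail", "multi_family"]) passes, while is_valid_land_use("residential", ["retail"]) does not.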

The land use parcel table contains a nationwide snapshot of parcels, their land use, their total built square footage, and their dwelling unit count. 

Replica models about 150 million disaggregate parcels nationwide. The built square footage is a modeled representation of total three-dimensional building space. There could be multiple buildings per parcel.

Field Name
Content Type
Sample Value
Description
parcel_id
Integer
7172757950704223187

Unique identifier for a parcel.

BLOCKGROUP
Text
010150014002

The US Census Bureau-assigned GEOID of the census blockgroup containing the parcel.

TRACT
Integer
01015001400

The US Census Bureau-assigned GEOID of the census tract containing the parcel.

parcel_sqft
Integer
9422

The square footage of the parcel geometry.

built_sqft
Integer
35870

The total, 3-dimensional built square footage across all floors of all buildings on the parcel. Each parcel could have more than one building. Note: since we do not have ground truth 3D area per building, this value is modeled from parcel attributes, building footprints, and aggregate building area totals.

landuse_l1
Text
mixed_use

Top-level land use of the parcel.

Valid values are:
• residential
• commercial
• mixed_use
• civic_institutional
• industrial
• transportation_utilities
• agriculture
• open_space
• other
• unknown

landuse_l2_primary
Text
retail

Sub-category of the top-level (L1) land use. For L1s that do not have sub-categories, this value is the same as the L1. Valid values are:

  • single_family
  • multi_family
  • retail
  • office
  • non_retail_attraction
  • healthcare
  • education
  • civic_institutional
  • industrial
  • transportation_utilities
  • agriculture
  • open_space
  • other
  • unknown

landuse_l2_secondary
Text
multi_family

Null for parcels that do not have a “mixed_use” L1. For mixed-use parcels, this is another subcategory, and shares the same possible values as landuse_l2_primary.

landuse_l2_tertiary
Text
[null]

See landuse_l2_secondary.

dwelling_units
Integer
3

Number of dwelling units on the parcel. This is 0 for all non-residential, non-mixed-use parcels. It is not guaranteed to be greater than 0 for residential or mixed-use parcels, because the parcel land use data is recent (2021) while dwelling_units is modeled from scaled 2010 Census block counts.

lat
Latitude
37.434211

Latitude of the parcel centroid, reported in decimal degrees.

lng
Longitude
-122.16801

Longitude of the parcel centroid, reported in decimal degrees.

geometry
Geography
POLYGON((-122.0426972 39.4026794, -122.0403184 39.4026725, -122.0402856 39.4062184, -122.0426579 39.4062214, -122.0426972 39.4026794))

Parcel geometry.
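
Two details of this table are easy to trip over: GEOIDs can begin with a zero (as in the samples above), and a parcel's L2 land uses are spread over three columns. The sketch below shows one way to handle both, assuming each row has been loaded as a Python dict keyed by field name. The helper names are hypothetical. The tract derivation relies on the standard Census GEOID layout, in which a 12-character block group GEOID is the 11-character tract GEOID plus one block group digit.

```python
def tract_from_blockgroup(blockgroup_geoid):
    """Derive the 11-character tract GEOID from a block group GEOID."""
    # zfill restores a leading zero that is lost if the value was parsed as an integer.
    geoid = str(blockgroup_geoid).zfill(12)
    return geoid[:11]


def l2_land_uses(parcel):
    """Collect the non-null L2 land uses of a parcel record, primary first."""
    keys = ("landuse_l2_primary", "landuse_l2_secondary", "landuse_l2_tertiary")
    return [parcel[key] for key in keys if parcel.get(key) is not None]


# Record assembled from the sample values in the table above.
sample_parcel = {
    "BLOCKGROUP": "010150014002",
    "landuse_l1": "mixed_use",
    "landuse_l2_primary": "retail",
    "landuse_l2_secondary": "multi_family",
    "landuse_l2_tertiary": None,
}

tract = tract_from_blockgroup(sample_parcel["BLOCKGROUP"])
uses = l2_land_uses(sample_parcel)
```

Reading GEOID columns as text rather than integers avoids the leading-zero problem in the first place.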

The lot (Off-Street) parking table covers public and commercial parking lots, and includes details about the number of spaces, rate structure, open hours, operators, and other lot attributes. This dataset does not include private and residential parking.

Field Name
Content Type
Sample Value
Description
lot_id
Integer
92181

Unique identifier for off-street parking lot.

lot_polygon
String
o{fsFrlqaSg@{DVMd@B`@lDw@V

Encoded representation of parking lot geometry.

lot_geometry
Geography
POLYGON((-105.2795 40.01708, -105.27863 40.01725, -105.27861 40.01744, -105.27868 40.01756, -105.27962 40.01736, -105.2795 40.01708))

WKT representation of parking lot geometry.

lot_centroid
Geography
POINT(-105.2790708 40.0173172)

WKT POINT (lng, lat) representation of parking lot centroid.

lot_sqft
Integer
29614

The area of the parking lot in square feet.

BLOCKID10
String
080130122043031

2010 census block ID of parking lot centroid.

num_spaces
Integer
97

The number of parking spaces in the parking lot.

num_handicap_spaces
Integer
3

The number of handicap parking spaces in the parking lot.

has_ev_charging
Boolean
false

Whether the parking lot has electric vehicle (EV) charging.

dwelling_units
Integer
Surface

Physical description of parking lot (surface, subterranean, etc.).

type
String
Non-restricted

Access restrictions, e.g., non-restricted, or valet only.

category
String
Public Parking

General purpose of lot, e.g., public-parking, or venue parking.

rate_card
String
1 Hour: $3, 1.5 Hours: $5, 2 Hours: $7, 3.5 Hours: $10, 24 Hours: $20

Information on pricing, when available.

open_hours
String
Mon-Sun: 24 Hours

Information on operating hours, when available.

operator
String
State Automobile Mutual Insurance Company

The name of the operating company or agency, when available.

weekday_averages
Integer, repeated (24)
[215, 200, 195, 230, 202, 160….]

The average number of available spaces by hour-of-day for a typical weekday. Hours are reported in the local time zone in 24-hour time format.

weekend_averages
Integer, repeated (24)
[105, 111, 121, 123, 90….]

The average number of available spaces by hour-of-day, for a typical weekend. Hours are reported in the local time zone in 24-hour time format.
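
The rate_card field is a free-text string, so any pricing analysis needs a parsing step. The sketch below handles only the "duration: price" pattern seen in the sample value above; the format is inferred from that one sample rather than documented, the function name is hypothetical, and entries that do not match are skipped rather than guessed at.

```python
import re

# Matches entries such as "1 Hour: $3" or "3.5 Hours: $10" (pattern inferred from the sample).
RATE_ENTRY = re.compile(r"^\s*([\d.]+)\s+Hours?:\s*\$([\d.]+)\s*$")


def parse_rate_card(rate_card):
    """Return a list of (hours, price_in_dollars) tuples for recognized entries."""
    tiers = []
    for entry in rate_card.split(","):
        match = RATE_ENTRY.match(entry)
        if match is None:
            continue  # unrecognized entry: leave it out instead of guessing
        tiers.append((float(match.group(1)), float(match.group(2))))
    return tiers


sample = "1 Hour: $3, 1.5 Hours: $5, 2 Hours: $7, 3.5 Hours: $10, 24 Hours: $20"
tiers = parse_rate_card(sample)
```

More complex rate cards, such as ones with day-of-week or time-of-day conditions, would need a richer parser than this.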

The on-street parking table includes the number of parking spaces that are located along individual sections of a block. Coverage is only available in specific concentrated parts of an area (often high-density areas).

Field Name
Content Type
Sample Value
Description
segment_id
Integer
5992917043770600482

A unique identifier for on-street parking segment.

osm_id
String
8917298_4

The corresponding OSM segment ID.

segment_polyline
String
ajreFz_njVPA

Encoded representation of parking segment geometry.

segment_geometry
Geography
LINESTRING(-122.465417 37.782247, -122.46541 37.782158)

WKT representation of parking segment geometry.

segment_centroid
Geography
POINT(-122.465413499998 37.7822025000001)

WKT POINT (lng, lat) representation of parking segment centroid.

BLOCKID10
String
060750402001008

2010 census block ID of parking segment centroid.

num_spaces
Integer
2

The number of parking spaces along the segment.

rate_card
String
15 Min (Mon-Fri; 9am-12pm): $0.25, 15 Min (Mon-Fri; 12pm-6pm): $0.56, 15 Min (Sat; 9am-12pm): $0.50, 15 Min (Sat; 12pm-3pm): $0.75, 15 Min (Sat; 3pm-6pm): $0.50, Mon-Sat (6pm-9am): Free, Sun: Free, Tue Every 1st & 3rd (7am-9am; Street Cleaning): No Parking

Information on pricing, when available.

weekday_averages
Integer, repeated (24)
[28, 27, 20, 21, 19, 21….]

The expected probability of finding an open spot by hour-of-day, for a typical weekday. Hours are reported in the local time zone in 24-hour time format. Values in this field range between 0 and 100.

weekend_averages
Integer, repeated (24)
[10, 11, 15, 21, 22, 23…]

The expected probability of finding an open spot by hour-of-day, for a typical weekend. Hours are reported in the local time zone in 24-hour time format. Values in this field range between 0 and 100.
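
The schema does not name the encoding used for segment_polyline (or lot_polygon in the off-street table). The sample value above is consistent with the widely used encoded polyline algorithm format at five decimal places, so the sketch below assumes that format; treat it as an assumption to verify against your own data. The function name decode_polyline is hypothetical.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, lng) tuples."""
    factor = 10 ** precision
    coords = []
    index = 0
    lat = 0
    lng = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift = 0
            result = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # no continuation bit: this value is complete
                    break
            # The lowest bit carries the sign; the remaining bits carry the magnitude.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        coords.append((lat / factor, lng / factor))
    return coords


points = decode_polyline("ajreFz_njVPA")
```

Decoding the sample segment_polyline yields two points that agree with the sample segment_geometry to five decimal places. Note that the decoded order is (lat, lng), whereas WKT lists lng first.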

Replica’s AADT data includes two metrics: (1) AADT of motor vehicles (any surface mode) for roadways, and (2) hourly volumes for each day of the week, averaged annually. Data is available for 2019 and 2021. Both metrics are available for download in shapefile format.

Field Name
Content Type
Sample Value
Description
id
String
9994843099764159364

A unique identifier for the network link.

bidirectional
Boolean
true

True if the provided traffic volume is the sum for both directions of an undivided roadway. False if the volume is for only one direction.

street_name
String
Isabel Avenue

The common name of the network link if available. Matches the name assigned by OpenStreetMap.

highway
String
trunk

The classification of the link based on OpenStreetMap data.

lanes
Integer
2

The number of travel lanes on the network link.

speed_limit
Integer
60

The speed limit on the link in miles per hour.

length
Float
229.22

The distance (length) of the network link in meters.

heading
Integer
176

The heading of the network link.

compass_direction
String
NE

The compass direction of the network link.

aadt
Integer
494

The annual average daily traffic volumes of the network link.

geometry
Geography
LINESTRING(-97.398509 27.662232, -97.398556 27.662034, -97.398517 27.661988)

The geometry (linestring) for each network link.

vol_{day}_{hour}
Integer
200

Volume of trips on the network link, with one field per day of the week (e.g., MON, FRI) and hour of the day (e.g., 0, 1, 2).
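
Because the hourly volumes are spread across many vol_{day}_{hour} fields, it can help to generate the field names rather than type them out. The sketch below assumes a link record loaded as a Python dict and hours numbered 0 through 23. Only the MON and FRI day codes appear in this schema; the other codes in DAYS are assumed by analogy, and the helper names are hypothetical.

```python
# Only MON and FRI are confirmed by the schema; the other day codes are assumed.
DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def hourly_columns(day):
    """Return the 24 hourly volume field names for one day of the week."""
    return [f"vol_{day}_{hour}" for hour in range(24)]


def daily_total(link, day):
    """Sum the 24 hourly volumes for one day; raises KeyError if a field is missing."""
    return sum(link[column] for column in hourly_columns(day))


# Illustrative record with a made-up flat volume of 10 trips in every hour.
example_link = {column: 10 for column in hourly_columns("MON")}
monday_total = daily_total(example_link, "MON")
```

Raising on a missing field, instead of defaulting to zero, makes a wrong day code or hour range visible immediately.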

Replica’s TMC data includes motor vehicle trip counts at most signalized intersections for each day of the week, bucketed into 1-hour intervals. The data is from Spring of 2022 and is available for download in .csv format.

Field Name
Content Type
Sample Value
Description
intersection_id
String
61.5522_-149.5329

Unique ID of the intersection.

intersection_id_geom
Geography
POINT(-149.5329 61.5522)

Location of the intersection as a WKT POINT (lng, lat).

turn_maneuver
String
through

Maneuver at the intersection, such as left, right, through, u-turn.

movement_direction
String
S_to_W

Start to end compass direction.

inbound_direction
String
E

Inbound compass direction.

inbound_heading
Integer
87

Heading of the inbound road described as degrees from north (0-360).

turning_movement
String
NBL

Turning movement code, e.g. ‘NBL’ (northbound left) or ‘NBT’ (northbound through).

state
String
AK

US state abbreviation indicating location of intersection.

inbound_stable_edge_id
String
10908660954266870460

A unique identifier for the network link.

inbound_street_name
String
Kenai Spur Highway

The common name of the network link if available. Matches the name assigned by OpenStreetMap.

outbound_osm_id
String
792610357

A unique identifier for the OSM way or supersegment ID. Can be converted to a set of stable edge IDs.

outbound_street_name
String
West Dimond Boulevard

The common name of the network link if available. Matches the name assigned by OpenStreetMap.

hour
Time
01:00:00

Start of the 1-hour count interval, in hh:mm:ss format.

day
String
Monday

Day of the week.

count
Integer
522

TMC counts scaled from observed numbers to match total volumes along network links.
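
Several TMC fields pack two values into one string. The sketch below splits them, assuming that intersection_id is the latitude and longitude joined by an underscore (which matches the sample intersection_id_geom above but is not stated in the schema) and that movement_direction always uses the "_to_" separator shown in the sample. The function names are hypothetical.

```python
def parse_intersection_id(intersection_id):
    """Split an ID such as "61.5522_-149.5329" into (lat, lng) floats."""
    lat_text, lng_text = intersection_id.split("_")
    return float(lat_text), float(lng_text)


def parse_movement_direction(movement_direction):
    """Split a value such as "S_to_W" into (start, end) compass directions."""
    start, end = movement_direction.split("_to_")
    return start, end


lat, lng = parse_intersection_id("61.5522_-149.5329")
start, end = parse_movement_direction("S_to_W")
```

Both helpers raise ValueError on input that does not fit the assumed pattern, which is preferable to silently producing a wrong coordinate or direction.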