Demographic, Economic, and Land Use Modeling
SANDAG uses four integrated models in its demographic, economic, and land use forecasts:
(1) the Demographic and Economic Forecasting Model (DEFM), (2) the Interregional
Commute Model (IRCM), (3) the Urban Development Model (UDM), and (4) the Population
Age, Sex, and Ethnicity Forecast (PASEF), in conjunction with the
Transportation Model.
A noteworthy feature of the forecasting process is the feedback of information from
one model to another. (See Figure 1.) For example, regionwide projections of jobs
and housing from DEFM are used in the IRCM and then the output from the IRCM is
used to adjust the output from DEFM. DEFM then provides the regionwide projections
that serve as the basis for UDM and PASEF. Similarly, data from UDM and PASEF are
major inputs to the transportation model, and then transportation model data are
used in subsequent UDM calculations. A key feature of the modeling system is the
central role that land use and transportation policies play in determining future
travel patterns and the associated location of people, houses, and jobs.
Figure 1: Modeling Process
These interrelated models satisfy the federal requirements specified in the Clean
Air Act Amendments of 1990 and the Safe, Accountable, Flexible, Efficient Transportation
Equity Act: A Legacy for Users (SAFETEA-LU). These legislative acts mandate that
transportation plans consider the long-range effects of the interaction between
land uses and the transportation system.
Demographic and Economic Forecasting Model (DEFM)
The Demographic and Economic Forecasting Model (DEFM) comprises an econometric
model and a demographic model and is currently used for SANDAG’s regionwide projections.
DEFM produces an annual forecast of the size and structure of the region’s economy
as well as a corresponding demographic forecast. For the economic forecast, DEFM
relates historical changes in the region’s economy to historical changes in the
national economy using a series of econometric equations that are interrelated (also
known as a simultaneous econometric model). The demographic module uses a cohort-survival
model to forecast population by age, gender, and ethnicity. DEFM produces a wealth
of data about the region’s future economic and demographic characteristics. Among
the more important elements are the size and composition of the population, employment
by industrial sector, household and personal income, housing units by structure
type, vacancy status and persons per household, labor force, and school enrollment.
The initial concept of DEFM in the late 1970s was a result of a cooperative modeling
effort between SANDAG and the County of San Diego that combined various forecasting
tools. With some improvements and modifications, the first version of the model
was successfully used for 18 years. Since then, DEFM has been expanded and revised
to improve performance.
Model Structure:
DEFM is designed to forecast population and economic variables for the region. To
forecast demographic variables, DEFM considers factors such as birth rates, survival
rates, and the age, sex, and ethnic distributions of the resident population. Economic
variables including employment, income, and housing supply are forecast based on
assumptions about national, state, and local growth patterns and inter-industry
relationships.
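The demographic side of this structure can be illustrated with a single cohort-survival step. This is a minimal sketch, not DEFM's implementation: the function name, the use of one birth rate per age applied to the whole cohort, and the treatment of net migration as a simple additive term are all assumptions made for illustration.

```python
def cohort_survival_step(pop, survival, birth_rate, net_migration):
    """Advance a population by one year (illustrative only).

    All arguments are lists indexed by single year of age; the last
    index is treated as an open-ended oldest age group.
    """
    oldest = len(pop) - 1
    # Births enter at age 0 (assumed: rates apply to the whole cohort).
    births = sum(p * b for p, b in zip(pop, birth_rate))
    nxt = [0.0] * len(pop)
    nxt[0] = births
    # Survivors of each age move up one year.
    for age in range(oldest):
        nxt[age + 1] += pop[age] * survival[age]
    # Survivors of the open-ended group remain in it.
    nxt[oldest] += pop[oldest] * survival[oldest]
    # Net migration is added after aging (an assumed simplification).
    return [n + m for n, m in zip(nxt, net_migration)]
```

In practice a step like this would be run separately for each gender and ethnic group, with rates specific to that group.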
There are many linkages, both direct and indirect, between the demographic and economic
variables that are accounted for and modeled by DEFM. For example, the population
determines housing demand, demand for public facilities, and associated public finance
projections. Economic activity, as measured by employment and output, depends in
part on the size of the local population and income level. Income, in turn, depends
in part on employment and labor market conditions. Over time, the population responds
to economic conditions as is evident from net migration levels. Thus, the region’s
economic activity depends on the local population, but the local population also
depends on economic activity. DEFM is designed to capture the main interdependencies
and interactions that exist in the region’s economy. The major linkages between
the demographic and economic sectors are illustrated in Figure 2 below.
Figure 2: Demographic and Economic Sector Linkages
Demographics
The demographic forecasts are based on age, gender, and ethnic detail of population
in the most recent estimate year. With this baseline data, DEFM forecasts population
by single year of age, gender, and ethnicity for the civilian population
and total population. Total population is equal to civilian population plus uniformed
military population and military dependents. Data about the military population,
including age, gender, and ethnic composition, are exogenous inputs to the model,
with demographic characteristics that are derived from Census data. Military populations
are treated differently in the model because they tend to comprise an ever-changing
group of people with similar demographic characteristics, as personnel and their
dependents move into and out of the San Diego region. Civilian populations, on the
other hand, are less mobile and tend to experience life course events within the
region throughout the forecast.
To estimate these three population components separately, DEFM first determines the
civilian population by subtracting the number of military in uniform from the total
population. The adjusted civilian population is then determined by subtracting the
military dependent population from the civilian population.
DEFM also considers labor force and group quarters population and calculates household
population to provide a more complete picture of the region’s population. To calculate
the region’s labor force (nonmilitary residents of working age who are actively
employed or seeking employment), DEFM multiplies civilian population in each age,
gender, and ethnic group by a natural labor force participation rate. The group
quarters population comprises uniformed military living in barracks or onboard
ship, college students living in dormitories, and people living in boarding houses,
homes for the disabled, rest homes, prisons, and other group living situations.
To calculate household population, DEFM subtracts group quarters population from
the total population. Household headship rates, or the portion of the population
who are household heads, are then used to determine the number of households.
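The arithmetic in this subsection can be summarized in a short sketch. The function names, the dictionary layout, and the idea of keying rates by a demographic group label are assumptions for illustration; the rates themselves would come from DEFM's inputs.

```python
def population_components(total, uniformed_military, military_dependents,
                          group_quarters):
    """Derive the population components described in the text."""
    civilian = total - uniformed_military
    adjusted_civilian = civilian - military_dependents
    household_population = total - group_quarters
    return {
        "civilian": civilian,
        "adjusted_civilian": adjusted_civilian,
        "household_population": household_population,
    }


def apply_rates(population_by_group, rate_by_group):
    """Sum population times a group-specific rate over all groups.

    With participation rates this yields the labor force; with
    headship rates it yields the number of households.
    """
    return sum(population_by_group[g] * rate_by_group[g]
               for g in population_by_group)
```

For example, `apply_rates(civilian_by_group, participation_rates)` would give the labor force, and the same function with headship rates would give households.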
Economic Activity
In DEFM, the level of economic activity is modeled in terms of employment and output
for 50 industries in San Diego County and an additional six activity categories
for government. The 50 private-sector industries selected represent the most important
activities in the region. For each industry, output provides a direct measure of
the quantity of goods and services produced in that industry. Employment represents
the main input into the production process. Output is derived according to employment
levels and labor productivity forecasts.
To model local economic activity, DEFM uses a market index that is representative
of local, state, and national economic conditions. The market index consists of
two parts. The first captures trends in demand for output in the San Diego region
while the second part captures the relative competitiveness of goods and services
produced in the San Diego region. Together, the combined index reflects a combination
of demand and supply factors that determine the market for regional employment and
the local share of the national market.
Construction:
The construction sector covers the level of residential building activity and residential
and nonresidential construction values. Activity levels and values are based on
local demand for homes and nonresidential building and national economic conditions,
captured by national construction levels.
The variables that DEFM captures in the housing supply component include housing
supply, vacancy rates, housing unit authorizations, construction value, and nonresidential
construction. Housing units are forecast by structure type as a function of the
previous year’s stock plus the completion of last year’s residential permit authorizations.
For single family and multifamily housing stocks, DEFM uses historical permit realization
rates. The stock of manufactured housing units (mobile homes) is held constant at
the base-year amount, minus any units lost to redevelopment.
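The stock accounting described above can be sketched as follows. The function name, dictionary keys, and example rates are hypothetical; only the structure (previous stock plus realized permits, with mobile homes held at the base-year amount less redevelopment losses) comes from the text.

```python
def project_housing_stock(stock, permits, realization_rate,
                          mobile_units_lost=0):
    """Return next year's housing stock by structure type.

    stock: units by type ('single_family', 'multifamily', 'mobile')
    permits: last year's authorizations for the two permitted types
    realization_rate: assumed share of permits completed, by type
    """
    projected = {}
    for kind in ("single_family", "multifamily"):
        completions = permits[kind] * realization_rate[kind]
        projected[kind] = stock[kind] + completions
    # Mobile homes: held at the base-year amount, less redevelopment losses.
    projected["mobile"] = stock["mobile"] - mobile_units_lost
    return projected
```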
Revenues & Expenditures:
DEFM contains data on revenue and expenditures for the San Diego region as well
as jurisdictions and school districts within the region. Revenue includes property
taxes, retail sales, federal and state grants and fines, service fees, assessments,
and so on. Expenditures include educational costs and other local government expenditures.
Prices:
DEFM includes four price variables – the consumer price index (CPI), the average
price of a single family home, a construction cost index, and a wage rate index.
All real income and price data in DEFM are based on constant (year 2000) dollars.
The consumer price index, which measures average change over time of prices paid
for consumer goods and services purchased by households, is projected to follow
the national rate of inflation, except for the home price component which may cause
the region’s inflation rate to rise faster (or slower) than the national rate. Housing
price level forecasts are based on a historical relationship with an adjusted income
variable. Construction costs are modeled as a function of a national cost index
but modified for relative wage levels in the region. DEFM also captures construction
activity in the region, which is computed based on dwelling unit authorizations per
household in the region. When this ratio is high, construction costs will be slightly
higher, reflecting high levels of demand; when the ratio is low, slack conditions
will result in reduced construction costs. The wage rate index for the San Diego
region is modeled as a function of the U.S. earnings index.
Income:
Personal income forecasts for San Diego County residents are developed for the following
income sources: payroll (wage and salary income); other labor income; proprietor’s
income; dividends, interest, and rent; and transfer payments, net of social security
payments.
Interregional Commute Model (IRCM)
The Interregional Commute Model (IRCM) accounts for individuals who work in the
region but live outside its boundaries. The IRCM predicts the future residential
location of the workers holding new jobs created in the San Diego region. The residential
location can be either inside the San Diego region, in Orange County, southwest
Riverside County, Imperial County, or in Tijuana/Northern Baja California. The main
result from the IRCM is the number of new housing units containing workers who are
employed in the San Diego region that will be built in the region and the number
that will be built in one of the four surrounding regions.
Model Structure
The IRCM assigns the residential location of workers based upon the accessibility
of potential residential sites to job locations, the availability of residential
land for development, and the relative price of homes. There are three basic tenets
of the IRCM. These three basic tenets also underlie the gravity model used in the
Urban Development Model (UDM).
IRCM Tenets
- As commuting time from work to possible residential locations increases, the probability
of choosing those locations decreases.
- More land available for residential development increases the potential for residential
growth.
- Lower home prices are also an attraction factor in residential location.
The model begins with the expected increase in jobs by Major Statistical Area (MSA)
during a specified time period. MSAs divide the San Diego region into seven major
subregions. The data for job growth come from the regionwide forecast (DEFM) and
are allocated to the MSAs proportionally based on their potential for employment
growth. Using the allocation probabilities, workers are assigned to residential
locations. In the next step, workers are converted to housing units using a regional
workers-per-household factor that is derived from the regionwide forecast. This
factor accounts for the likelihood that each house may be home to more than one
worker, and is calculated as the ratio of the change in jobs to change in housing
units during the specified forecast period. The allocation of housing units to each
MSA and to surrounding regions is then compared to the housing capacity of each
geographic area to ensure that the housing units do not exceed capacity. In the
event that housing units exceed the capacity, the excess is subtracted from the
allocation and reconverted into workers. These are unallocated workers.
At this point, the beginning capacities are replaced with the remaining housing
unit capacities and the normalized probabilities of place of work to place of residence
are recalculated. (Note that MSAs that reached their initial capacities now have
a housing capacity of zero and therefore, a normalized probability of place-of-work
to place-of-residence of zero.) The unassigned housing units then are allocated
in this second iteration of the model. These procedures are repeated until all housing
units are allocated to MSAs or areas outside the region.
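The iterative, capacity-constrained allocation described above can be sketched as follows. This is an illustration under stated assumptions, not the IRCM spreadsheet logic: the function and argument names are hypothetical, the weights are taken to be commute probability times remaining capacity times a price factor, and excess demand in a full area is returned to workplaces in proportion to their share of that demand.

```python
def allocate_housing(job_growth, commute_prob, capacity, price_factor,
                     workers_per_household, max_iterations=100):
    """Allocate new workers to residence areas as housing units.

    job_growth: new workers by workplace area
    commute_prob[work][home]: commute probability for each pair
    capacity: remaining housing unit capacity by residence area
    Returns (units by residence area, unallocated workers by workplace).
    """
    remaining = dict(capacity)
    units = {home: 0.0 for home in capacity}
    pending = dict(job_growth)
    for _ in range(max_iterations):
        if sum(pending.values()) < 1e-9:
            break
        # Housing units requested by each workplace in each residence area.
        demand = {}
        for work, workers in pending.items():
            if workers <= 0:
                continue
            weights = {home: commute_prob[work][home] * remaining[home]
                       * price_factor[home] for home in remaining}
            total = sum(weights.values())
            if total <= 0:
                raise ValueError(f"no remaining capacity for workers in {work}")
            demand[work] = {home: workers * w / total / workers_per_household
                            for home, w in weights.items()}
        next_pending = {work: 0.0 for work in pending}
        for home in remaining:
            requested = sum(d[home] for d in demand.values())
            if requested <= 0:
                continue
            granted = min(requested, remaining[home])
            excess_share = 1.0 - granted / requested
            # Excess units are reconverted into unallocated workers.
            for work, d in demand.items():
                next_pending[work] += (d[home] * excess_share
                                       * workers_per_household)
            units[home] += granted
            remaining[home] -= granted
        pending = next_pending
    return units, pending
```

Because a full area has zero remaining capacity, its weight drops to zero in the next pass, which mirrors the recalculation of normalized probabilities described in the text.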
Accessibility:
Accessibility is measured in terms of travel times from potential residential sites
to potential work sites. IRCM uses the most current travel times produced by the
transportation model. Within the San Diego region, the IRCM uses seven Major Statistical
Areas (MSAs) as the geographic analysis units. The seven MSAs are North County
West, North County East, East Suburban, East County, North City, Central, and South
Suburban. MSAs were selected because they are comparable in size to the four external
regions in the model (e.g., southwest Riverside County and Tijuana/Northern Baja California),
and Traffic Analysis Zones (TAZs) nest within them. The model’s travel-time matrix
is based on average TAZ travel times between MSAs.
The travel time assumptions also include commuting time from surrounding regions.
Commute times are calculated from each MSA to the borders between the San Diego
region and the four other regions (Riverside County, Orange County, Imperial
County, and Northern Baja California) and include travel time on local roads outside
the San Diego region as well as any projected border crossing wait times.
Availability of Residential Land:
Besides commuting times, the availability of residential land has significant influence
on the allocation of future workers in the IRCM. An estimate of potential residential
activity in the San Diego region is derived from land use policy inputs provided
by local jurisdictions, and measures the number of additional housing units that
could potentially be built under current general plan and zoning assumptions. SANDAG
works with other metropolitan planning organizations like the Southern California
Association of Governments (SCAG) to obtain data on potential residential activities
in surrounding regions.
Relative Price of Housing:
In addition to commuting times and the availability of residential land, home prices
have an effect on residential location in the IRCM. Lower-priced homes are an attraction
factor in the model. Home price data for the region is collected at the zip code
level for a series of years and is aggregated up to the MSA level to produce a multi-year
average. Data on home prices are also collected for the surrounding regions of Orange
County, Imperial County, Riverside County, and Tijuana. The home prices for surrounding
regions are converted into an Attractiveness Ratio, which is measured as the San Diego
County sales price divided by the area’s sales price. If an area has an average home
price lower than the county average, it will have an Attractiveness Ratio greater
than one and vice versa.
Travel Time and Allocation:
The travel times are converted into probabilities by comparing the length of the
commute to the likelihood of making the commute. These probabilities are derived
from a mathematical function that relates the probability of commuting a specific
distance to the log of the time it takes to make that commute. These commute probabilities
are then multiplied by the potential residential activity within each area and by
the housing price attractiveness factor. The result is a set of allocation probabilities
of working in one area and living in another that take into account commute
times and probabilities, the availability of land for residential activity, and
relative home prices.
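The combination of factors described above can be sketched as follows. The text does not give the functional form that relates commute probability to the log of travel time, so the linear-in-log form and its coefficients `a` and `b` below are purely assumed for illustration, as are all function names.

```python
import math


def commute_probability(minutes, a=1.0, b=0.2):
    """Assumed form: probability declines linearly in log(travel time).

    minutes must be positive; the result is clipped to [0, 1].
    """
    return min(1.0, max(0.0, a - b * math.log(minutes)))


def attractiveness_ratio(san_diego_price, area_price):
    """San Diego County sales price divided by the area's sales price."""
    return san_diego_price / area_price


def allocation_probabilities(travel_minutes, capacity, price_ratio):
    """Normalize commute probability x capacity x price attractiveness.

    All arguments are dicts keyed by residence area, for one workplace.
    """
    raw = {area: commute_probability(travel_minutes[area])
           * capacity[area] * price_ratio[area]
           for area in travel_minutes}
    total = sum(raw.values())
    return {area: value / total for area, value in raw.items()}
```

An area with a lower average home price than the county has a ratio above one, so it receives a larger share than an otherwise identical area.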
Urban Development Model (UDM)
The Urban Development Model (UDM) allocates employment, population, housing and
income from the regional forecast produced by DEFM to neighborhoods and jurisdictions
within the region. The model is designed to forecast the location of residential
and non-residential activity within the region for 5-year periods. Major model inputs
include the current spatial distribution of jobs, housing units, income, and population.
Land use data collected from local jurisdictions including general plans, policies,
and current and future transportation infrastructure are also critical to the model.
UDM also satisfies the federal requirements specified in the Clean Air Act Amendments
of 1990 and the Safe, Accountable, Flexible, Efficient Transportation Equity Act:
A Legacy for Users (SAFETEA-LU). These legislative acts mandate that transportation
plans consider the long-range effects of the interaction between land uses and the
transportation system.
Model Structure:
UDM has three major components. The first component allocates regional employment.
The second component determines the location of residential activity, based on the
spatial distribution of employment from the first component. The final component
of UDM provides a forecast of other demographic and economic characteristics including
occupied units, population, household income, and employment by industrial classification.
UDM uses a two-step allocation procedure during which it first allocates activities
to 127 Zones for Urban Modeling (ZUM), which are generally groups of census tracts
that conform to jurisdictions or community planning areas. In the second step of
allocation, UDM allocates the ZUM forecast to over 800,000 parcel polygons, which
are geographic areas based on assessor’s parcels.
Figure 1 illustrates the major components of UDM and its relationships to the regional
forecast and transportation model. UDM provides a forecast at 5-year periods. Figure
1 illustrates the relationships between two periods, shown as Time Period n
and Time Period n+5 years.
UDM captures the link between work place location and residential location through
commuting patterns and travel times for both highways and transit within the region
supplied by the transportation model. UDM uses a probability distribution to allocate
housing units based on work location around the region.
UDM combines the transportation and land use factors mathematically to determine
the likelihood that an employee at his or her place of work will reside in alternative
residential locations around the region. In general, areas closer to employment
opportunities are more attractive to employees as potential residences than areas
further away from the place of employment. Therefore, as available residential capacity
closer to work places is consumed, new employees are forced to travel longer distances
to find suitable residential locations. Residential growth in a jurisdiction is
influenced by growth within that jurisdiction as well as in surrounding areas and
other parts of the region.
After UDM determines the residential location of employed residents, it uses several
local factors to derive household, housing stock, and population. UDM forecasts
these local factors based on regional trends and other indicators including housing
mix (i.e., single family units, multifamily units, and mobile homes).
Figure 1: Key Components of UDM
Step 1: Employment Allocation (Civilian)
Employment allocation is the first model component. During this step, UDM determines
the location of civilian employment based on the distribution of existing and previously
forecast employment, opportunities for employment development throughout the region,
and travel times. UDM then estimates the civilian employment
share for a particular ZUM rather than the actual number of jobs. In order to convert
these employment shares to employment numbers, UDM multiplies the share by the civilian
employment forecast for that ZUM. During this allocation process, the model also
accounts for site-specific activities (projects that are known to be under construction
and are an input to the model, rather than projected using forecasting techniques).
UDM then determines the employment change in each ZUM by subtracting base year employment
from the employment forecast.
UDM next allocates employment change to parcels based on their development priority,
defined as accessibility to residential and civilian employment activities. At the
parcel level, the model must consider the planned land use of a parcel.
If the development is to occur on land that is designated as employment infill, no
land use accounting is required. However, if the development will occur on land
designated as mixed use, vacant, or something other than employment infill, the
model must determine the number of acres changing land uses.
Step 2: Housing Allocation
In the next module, UDM determines the location of residential activity. The model
distributes single family stock, multifamily stock, and mobile homes. In order to
complete this step, the model relies on the forecast of employment, commuting probabilities
and travel times, and opportunities for residential development based on capacity
(where capacity is an estimate of the number and type of additional units that can
reasonably be built on a parcel, given existing conditions, land use plans, and
constraints to development such as steep slopes). UDM considers four major factors
in forecasting residential activity.
- Residential activities are primarily determined by employment location.
- The longer the work trip, the less likely that a person will make that trip.
- The more land available for residential development, the greater the potential for residential
growth.
- Residential growth occurs only in areas with capacity for residential growth.
To allocate housing at the ZUM level, UDM first determines the residential location
of workers throughout the region. This step is performed twice: once for workers
in single-family homes and once for workers in multifamily homes. The allocated workers are referred
to as “employed residents”. Next, UDM determines the housing units needed to house
the employed residents based on an employed residents per household rate. This ZUM-specific
rate considers local unemployment rates, multiple-worker households, labor force
participation rates, age structure, and income. Using this rate, UDM determines
the number of households needed to accommodate the forecast of employed residents.
The ZUM forecast of households is a temporary output that is finalized in the last
module of UDM. After determining the necessary number of housing units, UDM distributes
the housing stock change to general land use categories for the single-family and
multifamily allocation based on the land use types and capacities within a ZUM.
In the last step of the module, UDM allocates mobile homes to ZUMs. The ZUM housing
stock forecast for each structure type matches the ZUM forecast of total housing
stock.
Similar to the parcel allocation for civilian employment, UDM allocates housing
stock change to parcels based on their accessibility order (development priority).
This is done separately for single family and multi-family units and their land
use categories. To complete the housing stock forecast, UDM allocates the ZUM mobile
home forecast to parcels. Similar to the land use accounting in civilian employment,
the model then determines the acreage change in land use.
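Parcel-level allocation by development priority can be sketched as a simple ordered fill. The text states only that parcels are taken in accessibility order; filling each parcel to its capacity before moving to the next, and the field names used here, are assumptions for illustration.

```python
def allocate_to_parcels(units_to_place, parcels):
    """Place a ZUM's housing stock change on parcels in priority order.

    parcels: list of dicts with 'id', 'priority' (lower is developed
    first), and 'capacity' (additional units the parcel can hold).
    Returns (units placed by parcel id, units left unplaced).
    """
    placed = {}
    remaining = units_to_place
    for parcel in sorted(parcels, key=lambda p: p["priority"]):
        if remaining <= 0:
            break
        take = min(remaining, parcel["capacity"])
        if take > 0:
            placed[parcel["id"]] = take
            remaining -= take
    return placed, remaining
```

The same routine could be run separately for single family and multifamily units, each with its own list of eligible parcels.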
Step 3: Assigning socio-economic characteristics at the sub-regional level
UDM uses the civilian employment and housing stock forecasts as input for the allocation
of the other characteristics including occupied units by structure type (single
family, multifamily, and mobile homes), household and group quarters populations,
employed residents, household income distribution, and civilian employment by industrial
category. UDM produces ten household income categories, and fourteen civilian employment
categories. It also provides a forecast of uniformed military employment. Changes
in uniformed military employment are external to the model and are treated as site-specific
activities.
ZUM occupied housing units are calculated by adding together units by structure
type to get total occupied units. The same is true for occupied units in Master Geographic Reference Areas (MGRAs). For
both geographies, a vacancy rate is applied.
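As a minimal sketch of this calculation, the following applies a vacancy rate to each structure type and sums the results. Whether UDM applies one rate per structure type or a single overall rate is not stated, so the per-type rates here are an assumption, as are the names.

```python
def occupied_units(stock_by_type, vacancy_rate_by_type):
    """Occupied units by structure type, plus their total."""
    occupied = {kind: units * (1.0 - vacancy_rate_by_type[kind])
                for kind, units in stock_by_type.items()}
    # Total is computed from the per-type values before being stored.
    occupied["total"] = sum(occupied.values())
    return occupied
```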
Population by Age, Sex, and Ethnicity Forecast (PASEF)
Model Overview
PASEF is a demographic model designed to forecast detailed demographic
characteristics (age, sex, and ethnicity)
at a neighborhood level. The detailed demographic forecast comes
directly from DEFM, but requires aggregating the single year of age detail into
the five-year age groups used in PASEF, and an adjustment for special populations.
The model projects population for 18 five-year age groups (0-4, 5-9, …, 80-84, and
85+) broken down by gender and ethnicity for the region and smaller geographies.
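Aggregating single-year-of-age detail into the 18 groups can be sketched as follows; the function name and the list-based input are assumptions for illustration.

```python
def to_five_year_groups(single_year_counts):
    """Collapse counts indexed by single year of age into 18 groups.

    Groups 0 through 16 cover ages 0-4 through 80-84; group 17 is 85+.
    """
    groups = [0] * 18
    for age, count in enumerate(single_year_counts):
        groups[min(age // 5, 17)] += count
    return groups
```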
Special Populations:
The forecast technique accounts separately for special populations, which include
military and college populations. Special populations are treated differently from
non-special populations because their characteristics remain relatively stable over
time. Therefore, while PASEF incorporates changes in the overall size of the special
population, it assumes that their age, sex, and ethnicity profile remains unchanged
over time. PASEF forecasts special populations using a bottom-up method whereby
census tracts are forecast first, then aggregated to sub-regional areas (SRAs) and
the region. The regional and SRA special population estimates are then used in the
calculation of non-special population estimates.
Non-Special Populations:
The non-special population forecast uses a top-down method: first for the region,
then for SRAs, next for census tracts, and finally for MGRAs, with the larger geographic
areas serving as controls. PASEF derives the non-special population by demographic
characteristic for the region by subtracting the special population estimates from
the regional forecast. For purposes
of controlling, PASEF also creates a regional non-special population estimate by
sex and ethnic group.
A two-step method provides the non-special population forecasts for SRAs. The first
step computes the sex and ethnic composition, and the second step computes the age
composition within each sex and ethnic group.
The sex and ethnic composition is based on the change in the sex and ethnic group
shares for each SRA and is an exogenous input based on historical trends. These
trends are derived from Census information and SANDAG’s latest detailed demographic
characteristic estimates. PASEF next computes the non-special population by age
within each sex and ethnic group.
During the second step, PASEF controls the age forecast within each sex and ethnic
group to the non-special demographic characteristic forecast for the SRA.
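The text does not specify how the controlling is carried out. One common approach, shown here only as an assumed illustration, is to scale a preliminary distribution proportionally so that it sums to the control total.

```python
def control_to_total(preliminary, control_total):
    """Scale preliminary values so they sum to control_total.

    Proportional scaling is an assumption; PASEF's actual controlling
    procedure is not described in the text.
    """
    current = sum(preliminary)
    if current == 0:
        return [0.0 for _ in preliminary]
    return [value * control_total / current for value in preliminary]
```

For example, a preliminary age distribution for one sex and ethnic group in an SRA would be scaled to that group's non-special total for the SRA.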
For census tracts classified as non-special, PASEF starts with the initial demographic
characteristics estimates previously developed, and for controlling purposes, creates
a non-special population estimate by sex and ethnic group. Next, PASEF uses the
same 2-step controlling method used for SRAs to derive the sex and ethnicity estimates.
The final stage in PASEF distributes the demographic characteristics estimates from
the census tracts to the MGRAs. The model assumes that each MGRA has the same demographic
characteristic distribution as the census tract in which it lies.
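This final distribution step can be sketched as applying the census tract's shares to each MGRA's population total. The names are hypothetical, and the MGRA totals are assumed to be available from an earlier step.

```python
def distribute_to_mgras(mgra_totals, tract_counts):
    """Give each MGRA the demographic distribution of its census tract.

    mgra_totals: population total by MGRA within one tract
    tract_counts: tract population by demographic category
    """
    tract_total = sum(tract_counts.values())
    shares = {cat: count / tract_total
              for cat, count in tract_counts.items()}
    return {mgra: {cat: total * share for cat, share in shares.items()}
            for mgra, total in mgra_totals.items()}
```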
Validation and Calibration
The demographic, economic, and land use forecasts are developed in a collaborative
process. SANDAG staff works closely with a wide range of professionals outside the
agency when preparing forecasts. For the regional forecast (DEFM), SANDAG convenes
a Regionwide Forecast Technical Advisory Working Group, which is composed of experts
in demography, housing, economics, and other disciplines from state and local agencies,
local universities, and the private sector. This committee is responsible for reviewing
the regional model structure, data inputs, and assumptions. Feedback from the committee
is incorporated into the model. The committee also evaluates the forecast results.
With the DEFM forecasts, SANDAG has a track record of less than 0.5 percent error,
on average, per forecast year.
SANDAG also relies on the Regional Planning Technical Working Group for advice on
the forecast, which provides information for jurisdictions, communities and other
areas within the region. This working group comprises the local jurisdictions’ planning
directors or their designees and representatives from other agencies within the
region that use the forecast data for facility and infrastructure planning. This
working group assists with local land use assumptions that are among the most important
inputs to the forecasting process.
Modeling Software
The DEFM model relies upon proprietary software, MetrixND, licensed from ITRON.
IRCM is a spreadsheet-based model. Software that implements UDM and PASEF has been
developed in C#. Tables with model data were created and stored in Microsoft SQL
Server databases.
Data Sources
These models require a wealth of data from a variety of sources. These sources are
outlined in the table below:
Data | Source(s) | Model(s)
Housing | U.S. Census Bureau, San Diego County Assessor, local jurisdictions | DEFM, UDM
Jobs (by industry) | U.S. Bureau of Labor Statistics, California Employment Development Department, U.S. Department of Defense, local jurisdictions | DEFM, UDM
Labor market (employment, unemployment, labor force participation) | U.S. Bureau of Labor Statistics | DEFM
Population and demographic characteristics | U.S. Census Bureau, California Department of Finance | DEFM, UDM, PASEF
Price levels and inflation | U.S. Bureau of Labor Statistics, National Association of Realtors, DataQuick Information Systems | DEFM, IRCM
Public finance | California Department of Finance | DEFM
Travel times | SANDAG transportation model | IRCM, UDM
United States projections | U.S. Census Bureau, and economic projections purchased from private-sector vendor (varies depending on series) | DEFM
Vital records (births, deaths) | California Department of Health | DEFM