Demographic and Land Use Model
Demographic, Economic, and Land Use Modeling
SANDAG uses four integrated models in its demographic, economic, and land use forecasts: (1) the Demographic and Economic Forecasting Model (DEFM), (2) the Interregional Commute Model (IRCM), (3) the Urban Development Model (UDM), and (4) the Population Age, Sex, and Ethnicity Forecast (PASEF), in conjunction with the Transportation Model.
A noteworthy feature of the forecasting process is the feedback of information from one model to another. (See Figure 1.) For example, regionwide projections of jobs and housing from DEFM are used in the IRCM and then the output from the IRCM is used to adjust the output from DEFM. DEFM then provides the regionwide projections that serve as the basis for UDM and PASEF. Similarly, data from UDM and PASEF are major inputs to the transportation model, and then transportation model data are used in subsequent UDM calculations. A key feature of the modeling system is the central role that land use and transportation policies play in determining future travel patterns and the associated location of people, houses, and jobs.
Figure 1: Modeling Process
These interrelated models satisfy the federal requirements specified in the Clean Air Act Amendments of 1990 and the Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (SAFETEA-LU). These legislative acts mandate that transportation plans consider the long-range effects of the interaction between land uses and the transportation system.
Demographic and Economic Forecasting Model (DEFM)
The Demographic and Economic Forecasting Model (DEFM) consists of an econometric model and a demographic model and is currently used for SANDAG’s regionwide projections. DEFM produces an annual forecast of the size and structure of the region’s economy as well as a corresponding demographic forecast. For the economic forecast, DEFM relates historical changes in the region’s economy to historical changes in the national economy using a series of interrelated econometric equations (also known as a simultaneous econometric model). The demographic module uses a cohort-survival model to forecast population by age, gender, and ethnicity. DEFM produces a wealth of data about the region’s future economic and demographic characteristics. Among the more important elements are the size and composition of the population, employment by industrial sector, household and personal income, housing units by structure type, vacancy status and persons per household, labor force, and school enrollment.
The initial concept of DEFM in the late 1970s was a result of a cooperative modeling effort between SANDAG and the County of San Diego that combined various forecasting tools. With some improvements and modifications, the first version of the model was successfully used for 18 years. Since then, DEFM has been expanded and revised to improve performance.
DEFM is designed to forecast population and economic variables for the region. To forecast demographic variables, DEFM considers factors such as birth rates, survival rates, and the age, sex, and ethnic distributions of the resident population. Economic variables including employment, income, and housing supply are forecast based on assumptions about national, state, and local growth patterns and inter-industry relationships.
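A single cohort-survival step of the kind described above can be sketched as follows. This is a minimal illustration rather than DEFM's actual implementation: the function name, the rate values, and the omission of migration and of gender and ethnic detail are all assumptions made for brevity.

```python
def age_one_year(pop, survival, birth_rate):
    """Advance a population by one year (illustrative only).

    pop[a]        -- people currently aged a (last index is an open-ended group)
    survival[a]   -- share of people aged a who survive the year
    birth_rate[a] -- births per person aged a during the year
    """
    n = len(pop)
    births = sum(p * b for p, b in zip(pop, birth_rate))
    new_pop = [0.0] * n
    new_pop[0] = births
    # Survivors of each age move up to the next single year of age.
    for a in range(n - 1):
        new_pop[a + 1] += pop[a] * survival[a]
    # Survivors of the open-ended oldest group stay in that group.
    new_pop[n - 1] += pop[n - 1] * survival[n - 1]
    return new_pop


# Hypothetical three-age example; the values are not SANDAG data.
print(age_one_year([100.0, 100.0, 100.0], [0.99, 0.98, 0.90], [0.0, 0.10, 0.0]))
```

In a fuller model the same step would be run separately for each gender and ethnic group, with net migration added to each cohort.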
There are many linkages, both direct and indirect, between the demographic and economic variables that are accounted for and modeled by DEFM. For example, the population determines housing demand, demand for public facilities, and associated public finance projections. Economic activity, as measured by employment and output, depends in part on the size of the local population and income level. Income, in turn, depends in part on employment and labor market conditions. Over time, the population responds to economic conditions as is evident from net migration levels. Thus, the region’s economic activity depends on the local population, but the local population also depends on economic activity. DEFM is designed to capture the main interdependencies and interactions that exist in the region’s economy. The major linkages between the demographic and economic sectors are illustrated in Figure 2 below.
Figure 2: Demographic and Economic Sector Linkages
The demographic forecasts are based on age, gender, and ethnic detail of population in the most recent estimate year. With this baseline data, DEFM forecasts population for single-year-of-age categories by gender and ethnicity for the civilian population and total population. Total population is equal to civilian population plus uniformed military population and military dependents. However, data about the military population, including age, gender, and ethnic composition, are exogenous inputs to the model, with demographic characteristics that are derived from Census data. Military populations are treated differently in the model because they tend to comprise an ever-changing group of people with similar demographic characteristics, as personnel and their dependents move into and out of the San Diego region. Civilian populations, on the other hand, are less mobile and tend to experience life course events within the region throughout the forecast.
These three population components are estimated separately: the civilian population is determined by subtracting the number of military in uniform from the total population, and the adjusted civilian population is determined by subtracting the military dependent population from the civilian population.
DEFM also considers the labor force and group quarters population and calculates household population to provide a more complete picture of the region’s population. To calculate the region’s labor force (nonmilitary residents of working age who are actively employed or seeking employment), DEFM multiplies civilian population in each age, gender, and ethnic group by a natural labor force participation rate. The group quarters population consists of uniformed military living in barracks or onboard ship, college students living in dormitories, and people living in boarding houses, homes for the disabled, rest homes, prisons, and other group living situations. To calculate household population, DEFM subtracts group quarters population from the total population. Household headship rates, or the portion of the population who are household heads, are then used to determine the number of households.
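The population accounting described above reduces to a few subtractions and rate multiplications. The sketch below illustrates that arithmetic; the function names, group keys, and numbers are hypothetical and are not taken from DEFM.

```python
def population_components(total, uniformed_military, military_dependents, group_quarters):
    """Return (civilian, adjusted civilian, household population)."""
    civilian = total - uniformed_military
    adjusted_civilian = civilian - military_dependents
    household_pop = total - group_quarters
    return civilian, adjusted_civilian, household_pop


def apply_rates(population_by_group, rate_by_group):
    """Sum of population times a group-specific rate.

    Works for the labor force (participation rates) and, under the same
    assumption, for households (headship rates).
    """
    return sum(population_by_group[g] * rate_by_group[g] for g in population_by_group)


# Hypothetical values for illustration only.
civilian, adjusted, hh_pop = population_components(1000, 50, 30, 40)
labor_force = apply_rates({"group_a": 100, "group_b": 200},
                          {"group_a": 0.5, "group_b": 0.25})
print(civilian, adjusted, hh_pop, labor_force)
```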
In DEFM, the level of economic activity is modeled in terms of employment and output for 50 industries in San Diego County and an additional six activity categories for government. The 50 private-sector industries selected represent the most important activities in the region. For each industry, output provides a direct measure of the quantity of goods and services produced in that industry. Employment represents the main input into the production process. Output is derived according to employment levels and labor productivity forecasts.
To model local economic activity, DEFM uses a market index that is representative of local, state, and national economic conditions. The market index consists of two parts. The first captures trends in demand for output in the San Diego region while the second part captures the relative competitiveness of goods and services produced in the San Diego region. Together, the combined index reflects a combination of demand and supply factors that determine the market for regional employment and the local share of the national market.
The construction sector covers the level of residential building activity and residential and nonresidential construction values. Activity levels and values are based on local demand for homes and nonresidential building and national economic conditions, captured by national construction levels.
The variables that DEFM captures in the housing supply component include housing supply, vacancy rates, housing unit authorizations, construction value, and nonresidential construction. Housing units are forecast by structure type as a function of the previous year’s stock plus the completion of last year’s residential permit authorizations. For single family and multifamily housing stocks, DEFM uses historical permit realization rates. The stock of manufactured housing units (mobile homes) is held constant at the base-year amount, minus any units lost to redevelopment.
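The stock-and-permit relationship described above can be sketched as a one-year update. This is an illustrative reading of the paragraph, not DEFM code: the dictionary keys, the realization rates, and the way redevelopment losses are passed in are assumptions.

```python
def forecast_housing_stock(stock, permits_last_year, realization_rate, mobile_homes_lost=0):
    """One-year housing stock update by structure type (illustrative)."""
    new_stock = {}
    for structure in ("single_family", "multifamily"):
        # Only a share of last year's authorized permits is completed.
        completions = permits_last_year[structure] * realization_rate[structure]
        new_stock[structure] = stock[structure] + completions
    # Mobile homes stay at the base-year amount, less redevelopment losses.
    new_stock["mobile_home"] = stock["mobile_home"] - mobile_homes_lost
    return new_stock


# Hypothetical inputs; not SANDAG data.
stock = {"single_family": 1000, "multifamily": 500, "mobile_home": 100}
permits = {"single_family": 100, "multifamily": 50}
rates = {"single_family": 0.9, "multifamily": 0.8}
print(forecast_housing_stock(stock, permits, rates, mobile_homes_lost=5))
```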
Revenues & Expenditures:
DEFM contains data on revenue and expenditures for the San Diego region as well as jurisdictions and school districts within the region. Revenue includes property taxes, retail sales, federal and state grants and fines, service fees, assessments, and so on. Expenditures include educational costs and other local government expenditures.
DEFM includes four price variables – the consumer price index (CPI), the average price of a single family home, a construction cost index, and a wage rate index. All real income and price data in DEFM are based on constant (year 2000) dollars.
The consumer price index, which measures the average change over time in prices paid for consumer goods and services purchased by households, is projected to follow the national rate of inflation, except for the home price component, which may cause the region’s inflation rate to rise faster (or slower) than the national rate. Housing price level forecasts are based on a historical relationship with an adjusted income variable. Construction costs are modeled as a function of a national cost index but modified for relative wage levels in the region. DEFM also captures construction activity in the region, which is computed based on dwelling unit authorizations per household in the region. When the ratio is high, construction costs will be slightly higher, reflecting high levels of demand; when the ratio is low, slack conditions will result in reduced construction costs. The wage rate index for the San Diego region is modeled as a function of the U.S. earnings index.
Personal income forecasts for San Diego County residents are developed for the following income sources: payroll (wage and salary income); other labor income; proprietors’ income; dividends, interest, and rent; and transfer payments, net of social security payments.
Interregional Commute Model (IRCM)
The Interregional Commute Model (IRCM) accounts for individuals who work in the region but live outside its boundaries. The IRCM predicts the future residential location of the workers holding new jobs created in the San Diego region. The residential location can be inside the San Diego region or in Orange County, southwest Riverside County, Imperial County, or Tijuana/Northern Baja California. The main result from the IRCM is the number of new housing units containing workers who are employed in the San Diego region that will be built in the region and the number that will be built in one of the four surrounding regions.
The IRCM assigns the residential location of workers based upon the accessibility of potential residential sites to job locations, the availability of residential land for development, and the relative price of homes. The IRCM rests on three basic tenets, which also underlie the gravity model used in the Urban Development Model (UDM):
- As commuting time from work to possible residential locations increases, the probability of choosing those locations decreases.
- More land available for residential development increases the potential for residential growth.
- Lower home prices are also an attraction factor in residential location.
The model begins with the expected increase in jobs by Major Statistical Area (MSA) during a specified time period. MSAs divide the San Diego region into seven major subregions. The data for job growth come from the regionwide forecast (DEFM) and are allocated to the MSAs proportionally based on their potential for employment growth. Using the allocation probabilities, workers are assigned to residential locations. In the next step, workers are converted to housing units using a regional workers-per-household factor that is derived from the regionwide forecast. This factor accounts for the likelihood that each house may be home to more than one worker, and is calculated as the ratio of the change in jobs to the change in housing units during the specified forecast period. The allocation of housing units to each MSA and to surrounding regions is then compared to the housing capacity of each geographic area to ensure that the housing units do not exceed capacity. In the event that housing units exceed the capacity, the excess is subtracted from the allocation and reconverted into workers. These are unallocated workers.
At this point, the beginning capacities are replaced with the remaining housing unit capacities and the normalized probabilities of place of work to place of residence are recalculated. (Note that MSAs that reached their initial capacities now have a housing capacity of zero and therefore, a normalized probability of place-of-work to place-of-residence of zero.) The unassigned housing units then are allocated in this second iteration of the model. These procedures are repeated until all housing units are allocated to MSAs or areas outside the region.
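The capacity-constrained iteration described above can be sketched as a loop that allocates, caps at capacity, and re-allocates the excess. The sketch is a simplified illustration, not the IRCM spreadsheet logic: it collapses the place-of-work dimension into a single set of residence probabilities, and all names and numbers are hypothetical.

```python
def allocate_housing_units(new_workers, probs, capacity, workers_per_household):
    """Allocate workers' housing units to areas without exceeding capacity."""
    remaining_cap = dict(capacity)
    allocated = {area: 0.0 for area in capacity}
    unallocated_workers = float(new_workers)
    while unallocated_workers > 1e-9:
        # Areas with no remaining capacity receive a probability of zero.
        weights = {a: (probs[a] if remaining_cap[a] > 0 else 0.0) for a in probs}
        total = sum(weights.values())
        if total == 0:
            break  # no capacity left anywhere
        excess_units = 0.0
        for area, w in weights.items():
            # Convert this area's share of workers into housing units.
            units = unallocated_workers * (w / total) / workers_per_household
            placed = min(units, remaining_cap[area])
            allocated[area] += placed
            remaining_cap[area] -= placed
            excess_units += units - placed
        # Reconvert excess housing units into workers for the next pass.
        unallocated_workers = excess_units * workers_per_household
    return allocated, unallocated_workers


# Hypothetical two-area example.
print(allocate_housing_units(300, {"A": 0.5, "B": 0.5}, {"A": 50, "B": 1000}, 1.5))
```

Each pass either places every remaining unit or exhausts the capacity of at least one area, so the loop ends after at most one pass per area plus one.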
Accessibility is measured in terms of travel times from potential residential sites to potential work sites. IRCM uses the most current travel times produced by the transportation model. Within the San Diego region, the IRCM uses seven Major Statistical Areas (MSAs) as the geographic analysis units. The seven MSAs are North County West, North County East, East Suburban, East County, North City, Central, and South Suburban. MSAs were selected because they are comparable in size to the four external regions in the model (e.g., southwest Riverside County and Tijuana/Northern Baja California), and Traffic Analysis Zones (TAZs) nest within them. The model’s travel-time matrix is based on average TAZ travel times between MSAs.
The travel time assumptions also include commuting time from surrounding regions. Commute times are calculated from each MSA to the borders between the San Diego region and the four other regions (Riverside County, Orange County, Imperial County, and Northern Baja California), and include travel time on local roads outside the San Diego region as well as any projected border crossing wait times.
Availability of Residential Land:
Besides commuting times, the availability of residential land has significant influence on the allocation of future workers in the IRCM. An estimate of potential residential activity in the San Diego region is derived from land use policy inputs provided by local jurisdictions, and measures the number of additional housing units that could potentially be built under current general plan and zoning assumptions. SANDAG works with other metropolitan planning organizations like the Southern California Association of Governments (SCAG) to obtain data on potential residential activities in surrounding regions.
Relative Price of Housing:
In addition to commuting times and the availability of residential land, home prices have an effect on residential location in the IRCM. Lower-priced homes are an attraction factor in the model. Home price data for the region are collected at the zip code level for a series of years and aggregated up to the MSA level to produce a multi-year average. Data on home prices are also collected for the surrounding regions of Orange County, Imperial County, Riverside County, and Tijuana. The home prices for surrounding regions are turned into an Attractiveness Ratio, which is measured as the San Diego County sales price divided by the area’s sales price. If an area has an average home price lower than the county average, it will have an Attractiveness Ratio greater than one, and vice versa.
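As a quick illustration of the Attractiveness Ratio defined above, the following sketch computes it for two hypothetical areas. The function name and the prices are made up for the example.

```python
def attractiveness_ratio(county_price, area_price):
    """San Diego County sales price divided by the area's sales price."""
    return county_price / area_price


# Hypothetical prices: a cheaper area scores above one, a pricier one below.
print(attractiveness_ratio(500_000, 400_000))
print(attractiveness_ratio(500_000, 625_000))
```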
Travel Time and Allocation:
The travel times are converted into probabilities by relating the length of the commute to the likelihood of making the commute. These probabilities are derived from a mathematical function that relates the probability of commuting a specific distance to the log of the time it takes to make that commute. These commute probabilities are then multiplied by the potential residential activity within each area and by the housing price attractiveness factor. The results are allocation probabilities of working in one area and living in another that take into account commute times and probabilities, the availability of land for residential activity, and relative home prices.
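The multiplication and normalization described above can be sketched as follows. The source does not give the commute function, so the log-linear form and its coefficients `a` and `b` below are purely hypothetical placeholders, as are the area names and input values.

```python
import math


def commute_probability(minutes, a=1.0, b=0.2):
    # Hypothetical functional form and coefficients: the text only states that
    # the probability is related to the log of commute time.
    return max(0.0, a - b * math.log(minutes))


def allocation_probabilities(travel_minutes, potential_units, attractiveness):
    """Combine commute, land availability, and price factors, then normalize."""
    weights = {
        area: commute_probability(travel_minutes[area])
        * potential_units[area]
        * attractiveness[area]
        for area in travel_minutes
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no area has a positive allocation weight")
    return {area: w / total for area, w in weights.items()}


# Hypothetical inputs for one place of work and two residence areas.
probs = allocation_probabilities(
    {"near": 10, "far": 60},
    {"near": 1000, "far": 1000},
    {"near": 1.0, "far": 1.0},
)
print(probs)
```

With equal land availability and prices, the nearer area receives the larger share, which matches the first tenet of the model.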
Urban Development Model (UDM)
The Urban Development Model (UDM) allocates employment, population, housing, and income from the regional forecast produced by DEFM to neighborhoods and jurisdictions within the region. The model is designed to forecast the location of residential and non-residential activity within the region for 5-year periods. Major model inputs include the current spatial distribution of jobs, housing units, income, and population. Land use data collected from local jurisdictions, including general plans, policies, and current and future transportation infrastructure, are also critical to the model.
UDM also satisfies the federal requirements specified in the Clean Air Act Amendments of 1990 and the Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (SAFETEA-LU). These legislative acts mandate that transportation plans consider the long-range effects of the interaction between land uses and the transportation system.
UDM has three major components. The first component allocates regional employment. The second component determines the location of residential activity, based on the spatial distribution of employment from the first component. The final component of UDM provides a forecast of other demographic and economic characteristics including occupied units, population, household income, and employment by industrial classification.
UDM uses a two-step allocation procedure in which it first allocates activities to 127 Zones for Urban Modeling (ZUMs), which are generally groups of census tracts that conform to jurisdictions or community planning areas. In the second step, UDM allocates the ZUM forecast to over 800,000 parcel polygons, which are geographic areas based on assessor’s parcels.
Figure 1 illustrates the major components of UDM and its relationships to the regional forecast and transportation model. UDM provides a forecast at 5-year periods. Figure 1 illustrates the relationships between two periods, shown as Time Period n and Time Period n+5 years.
UDM captures the link between work place location and residential location through commuting patterns and travel times for both highways and transit within the region supplied by the transportation model. UDM uses a probability distribution to allocate housing units based on work location around the region.
UDM combines the transportation and land use factors mathematically to determine the likelihood that an employee at his or her place of work will reside in alternative residential locations around the region. In general, areas closer to employment opportunities are more attractive to employees as potential residences than areas farther away from the place of employment. Therefore, as available residential capacity closer to work places is consumed, new employees are forced to travel longer distances to find suitable residential locations. Residential growth in a jurisdiction is influenced by growth within that jurisdiction as well as in surrounding areas and other parts of the region.
After UDM determines the residential location of employed residents, it uses several local factors to derive household, housing stock, and population. UDM forecasts these local factors based on regional trends and other indicators including housing mix (i.e. single family units, multifamily units, and mobile home).
Figure 1: Key Components of UDM
Step 1: Employment Allocation (Civilian)
Employment allocation is the first model component. During this step, UDM determines the location of civilian employment based on the distribution of existing and previously forecasted employment, in addition to considering opportunities for employment development throughout the region and travel times. UDM then estimates the civilian employment share for a particular ZUM rather than the actual number of jobs. In order to convert these employment shares to employment numbers, UDM multiplies the share by the civilian employment forecast for that ZUM. During this allocation process, the model also accounts for site-specific activities (projects that are known to be under construction and are an input to the model, rather than projected using forecasting techniques). UDM then determines the employment change in each ZUM by subtracting base year employment from the employment forecast.
UDM next allocates employment change to parcels based on their development priority, defined as accessibility to residential and civilian employment activities. At the parcel level, the model must consider the planned land use of a parcel. If the development is to occur on land that is designated as employment infill, no land use accounting is required. However, if the development will occur on land designated as mixed use, vacant, or something other than employment infill, the model must determine the number of acres changing land uses.
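One plausible way to picture the parcel-level step is a fill in priority order. The source states only that allocation is based on development priority, so the greedy rule, the field names, and the example parcels below are assumptions rather than UDM's documented procedure.

```python
def allocate_to_parcels(zum_change, parcels):
    """Fill parcels in priority order until the ZUM change is used up.

    parcels: list of dicts with 'id', 'priority' (lower value is developed
    first), and 'capacity' (additional jobs or units the parcel can hold).
    """
    remaining = zum_change
    allocation = {}
    for parcel in sorted(parcels, key=lambda p: p["priority"]):
        if remaining <= 0:
            break
        placed = min(remaining, parcel["capacity"])
        if placed > 0:
            allocation[parcel["id"]] = placed
            remaining -= placed
    return allocation, remaining


# Hypothetical parcels within one ZUM.
parcels = [
    {"id": "p1", "priority": 2, "capacity": 100},
    {"id": "p2", "priority": 1, "capacity": 50},
    {"id": "p3", "priority": 3, "capacity": 80},
]
print(allocate_to_parcels(120, parcels))
```

Any amount returned as `remaining` is growth that could not be placed within the listed parcels' capacity.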
Step 2: Housing Allocation
In the next module, UDM determines the location of residential activity. The model distributes single family stock, multifamily stock, and mobile homes. In order to complete this step, the model relies on the forecast of employment, commuting probabilities and travel times, and opportunities for residential development based on capacity (where capacity is an estimate of the number and type of additional units that can reasonably be built on a parcel, given existing conditions, land use plans, and constraints to development such as steep slopes). UDM considers four major factors in forecasting residential activity.
- Residential activities are primarily determined by employment location.
- The longer the work trip, the less likely that a person will make that trip.
- The more land available for residential development, the greater potential for residential growth.
- Residential growth occurs only in areas with capacity for residential growth.
To allocate housing at the ZUM level, UDM first determines the residential location of workers throughout the region. This step is performed twice: once for workers in single-family homes and once for workers in multifamily homes. The allocated workers are referred to as “employed residents”. Next, UDM determines the housing units needed to house the employed residents based on an employed-residents-per-household rate. This ZUM-specific rate considers local unemployment rates, multiple-worker households, labor force participation rates, age, structure, and income. Using this rate, UDM determines the number of households needed to accommodate the forecast of employed residents. The ZUM forecast of households is a temporary output that is finalized in the last module of UDM. After determining the necessary number of housing units, UDM distributes the housing stock change to general land use categories for the single-family and multifamily allocation based on the land use types and capacities within a ZUM. In the last step of the module, UDM allocates mobile homes to ZUMs. The ZUM housing stock forecast for each structure type matches the ZUM forecast of total housing stock.
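The conversion from employed residents to households described above amounts to a division by the ZUM-specific rate. The sketch below shows that step only; the ZUM identifiers and values are hypothetical.

```python
def households_needed(employed_residents, residents_per_household):
    """Households required to house each ZUM's employed residents."""
    return {
        zum: employed_residents[zum] / residents_per_household[zum]
        for zum in employed_residents
    }


# Hypothetical ZUMs with different employed-residents-per-household rates.
print(households_needed({"zum_1": 1200, "zum_2": 900},
                        {"zum_1": 1.2, "zum_2": 1.5}))
```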
Similar to the parcel allocation for civilian employment, UDM allocates housing stock change to parcels based on their accessibility order (development priority). This is done separately for single family and multi-family units and their land use categories. To complete the housing stock forecast, UDM allocates the ZUM mobile home forecast to parcels. Similar to the land use accounting in civilian employment, the model then determines the acreage change in land use.
Step 3: Assigning socio-economic characteristics at the sub-regional level
UDM uses the civilian employment and housing stock forecasts as input for the allocation of the other characteristics including occupied units by structure type (single family, multifamily, and mobile homes), household and group quarters populations, employed residents, household income distribution, and civilian employment by industrial category. UDM produces ten household income categories, and fourteen civilian employment categories. It also provides a forecast of uniformed military employment. Changes in uniformed military employment are external to the model and are treated as site-specific activities.
ZUM occupied housing units are calculated by adding together units by structure type to get total occupied units. The same is true for MGRA occupied units. For both geographies, a vacancy rate is applied.
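A minimal sketch of the occupied-unit calculation follows. It assumes, for illustration, that a vacancy rate is available for each structure type; the source does not say at what level of detail the rate is applied, and the numbers are invented.

```python
def occupied_units(stock_by_type, vacancy_rate_by_type):
    """Occupied units by structure type and their total."""
    occupied = {
        t: stock_by_type[t] * (1.0 - vacancy_rate_by_type[t])
        for t in stock_by_type
    }
    return occupied, sum(occupied.values())


# Hypothetical stock and vacancy rates for one ZUM.
by_type, total = occupied_units(
    {"single_family": 1000, "multifamily": 500, "mobile_home": 100},
    {"single_family": 0.05, "multifamily": 0.10, "mobile_home": 0.0},
)
print(by_type, total)
```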
Population by Age, Sex, and Ethnicity Forecast (PASEF)
The program for forecasting detailed demographic characteristics (age, sex, and ethnicity - PASEF) is a demographic model designed to forecast detailed demographic characteristics at a neighborhood level. The detailed demographic forecast comes directly from DEFM, but requires aggregating the single-year-of-age detail into the five-year age groups used in PASEF, and an adjustment for special populations. The model projects population for 18 five-year age groups (0-4, 5-9, …, 80-84, and 85+) broken down by gender and ethnicity for the region and smaller geographies.
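The aggregation from single years of age into the 18 groups listed above can be sketched as follows. The function name and the flat test population are illustrative assumptions.

```python
def to_five_year_groups(pop_by_age):
    """Collapse single-year ages into 18 groups: 0-4, 5-9, ..., 80-84, 85+.

    pop_by_age[a] is the population aged a.
    """
    groups = [0.0] * 18
    for age, count in enumerate(pop_by_age):
        # Ages 85 and over all fall into the final, open-ended group.
        groups[min(age // 5, 17)] += count
    return groups


# Hypothetical population with one person at each age from 0 to 100.
print(to_five_year_groups([1] * 101))
```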
The forecast technique accounts separately for special populations which include military and college population. Special populations are treated differently from non-special populations because their characteristics remain relatively stable over time. Therefore, while PASEF incorporates changes in the overall size of the special population, it assumes that their age, sex, and ethnicity profile remain unchanged over time. PASEF forecasts special populations using a bottom-up method whereby census tracts are forecast first, then aggregated to sub-regional areas (SRAs) and the region. The regional and SRA special population estimates are then used in the calculation of non-special population estimates.
The non-special population forecast uses a top-down method – first for the region, then for SRAs, next for census tracts, and finally for MGRAs, with the larger geographic areas serving as controls. PASEF derives the non-special population by demographic characteristic for the region by subtracting the special population estimates. For purposes of controlling, PASEF also creates a regional non-special population estimate by sex and ethnic group.
A two-step method provides the non-special population forecasts for SRAs. The first step computes the sex and ethnic composition, and the second step computes the age composition within each sex and ethnic group.
The sex and ethnic composition is based on the change in the sex and ethnic group shares for each SRA and is an exogenous input based on historical trends. These trends are derived from Census information and SANDAG’s latest detailed demographic characteristic estimates. PASEF next computes the non-special population by age within each sex and ethnic group.
During the second step, PASEF controls the age forecast within each sex and ethnic group to the non-special demographic characteristic forecast for the SRA.
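Controlling a detailed distribution to a fixed total is often done by proportional scaling. The source does not specify the method PASEF uses, so the sketch below shows proportional scaling only as one common approach, with hypothetical values.

```python
def control_to_total(values, control_total):
    """Scale values proportionally so that they sum to control_total."""
    current = sum(values)
    if current == 0:
        return [0.0 for _ in values]
    factor = control_total / current
    return [v * factor for v in values]


# Hypothetical age distribution for one sex and ethnic group in an SRA,
# controlled to a total of 200.
print(control_to_total([10, 30, 60], 200))
```

Scaling preserves each age group's share of the total while forcing the sum to match the control.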
For census tracts classified as non-special, PASEF starts with the initial demographic characteristics estimates previously developed and, for controlling purposes, creates a non-special population estimate by sex and ethnic group. Next, PASEF uses the same two-step controlling method used for SRAs to derive the sex and ethnicity estimates.
The final stage in PASEF distributes the demographic characteristics estimates from the census tracts to the MGRAs. The model assumes that each MGRA has the same demographic characteristic distribution as the census tract in which it lies.
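The tract-to-MGRA assumption described above can be sketched as applying the tract's shares to each MGRA's population. The sketch assumes the MGRA's total population is already known; the category labels and counts are hypothetical.

```python
def distribute_to_mgra(mgra_population, tract_distribution):
    """Apply the census tract's demographic shares to an MGRA's population."""
    tract_total = sum(tract_distribution.values())
    return {
        category: mgra_population * count / tract_total
        for category, count in tract_distribution.items()
    }


# Hypothetical tract with two demographic categories.
print(distribute_to_mgra(200, {"category_a": 300, "category_b": 100}))
```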
Validation and Calibration
The demographic, economic, and land use forecasts are developed in a collaborative process. SANDAG staff works closely with a wide range of professionals outside the agency when preparing forecasts. For the regional forecast (DEFM), SANDAG convenes a Regionwide Forecast Technical Advisory Working Group, which is composed of experts in demography, housing, economics, and other disciplines from state and local agencies, local universities, and the private sector. This committee is responsible for reviewing the regional model structure, data inputs, and assumptions. Feedback from the committee is incorporated into the model. The committee also evaluates the forecast results. With the DEFM forecasts, SANDAG has a track record of less than 0.5 percent error, on average, per forecast year.
SANDAG also relies on the Regional Planning Technical Working Group for advice on the forecast, which provides information for jurisdictions, communities and other areas within the region. This working group comprises the local jurisdictions’ planning directors or their designees and representatives from other agencies within the region that use the forecast data for facility and infrastructure planning. This working group assists with local land use assumptions that are among the most important inputs to the forecasting process.
The DEFM model relies upon proprietary software, MetrixND, licensed from ITRON. IRCM is a spreadsheet-based model. Software that implements UDM and PASEF has been developed in C#. Tables with model data were created and stored in Microsoft SQL Server databases.
These models require a wealth of data from a variety of sources. These sources are outlined in the table below:
|Data||Sources||Models|
|Housing||U.S. Census Bureau, San Diego County Assessor, local jurisdictions||DEFM, UDM|
|Jobs (by industry)||U.S. Bureau of Labor Statistics, California Employment Development Department, U.S. Department of Defense, local jurisdictions||DEFM, UDM|
|Labor market (employment, unemployment, labor force participation)||U.S. Bureau of Labor Statistics||DEFM|
|Population and demographic characteristics||U.S. Census Bureau, California Department of Finance||DEFM, UDM, PASEF|
|Price levels and inflation||U.S. Bureau of Labor Statistics, National Association of Realtors, DataQuick Information Systems||DEFM, IRCM|
|Public finance||California Department of Finance||DEFM|
|Travel times||SANDAG transportation model||IRCM, UDM|
|United States projections||U.S. Census Bureau, and economic projections purchased from private-sector vendor (varies depending on series)||DEFM|
|Vital records (births, deaths)||California Department of Health||DEFM|