Identifying Better Ways to Make Local Data a Bigger Part of Solving America’s Housing Challenges
Small- and medium-sized multifamily housing (SMMF) remains a crucial source of affordable housing. SMMF, defined as properties with two to 49 units, accounts for 21 percent of all housing units and 54 percent of all rental homes nationwide. Understanding why those properties tend to be affordable and how best to preserve their affordability relies heavily on local, and often messy, data.
Last week, SMMF was a focus of a convening on local data-driven approaches to addressing affordable housing needs, hosted by Enterprise Community Partners’ Policy Development and Research team. The event, which was supported by JP Morgan Chase, brought together leading researchers from academic and practitioner organizations who are collecting, organizing, and analyzing parcel-level data to address local needs related to the ownership of these properties.
The objective of the day-long meeting was to share examples of, and strategies for, conducting this data-intensive work and to support the preservation of affordable homes. The gathering included demonstrations and discussions of software and technology infrastructure for cleaning, sorting, mapping, and managing large datasets; of tools for community engagement (for both collecting and disseminating data); and of how research findings can inform policies and programs. By not only sharing findings but also engaging on the technical steps necessary to conduct the research, participants advanced a conversation aimed at developing common strategies and best practices that others interested in data-driven solutions for increasing the supply of affordable housing can replicate.
The day began with an overview of the SMMF stock. It makes up more than one-fifth of all housing in suburban and rural communities and more than half of all subsidized housing in the U.S. With both lower average rents and lower renter incomes relative to other rental housing, units in SMMF properties are an important source of affordable rental housing for lower-income households. Based on an analysis presented at the convening by Anthony Orlando of Cal State-Pomona and Seva Rodnyansky of UC Berkeley, just over half of affordable units observed in 1991 remained affordable ten years later. Identifying and preserving these units is thus vital to maintaining an adequate supply of affordable housing options over time.
The challenges in the identification process are numerous, ranging from issues with data quality and access to the complexity and intensity of the analysis required. Fortunately, several researchers are developing ways to overcome these barriers in their own work on local data-driven approaches to addressing affordable housing needs.
Presentations during the convening summarized a few of these efforts:
- Orlando and Rodnyansky demonstrated OpenRefine software, which uses text and phonetic matching algorithms to identify commonalities among property records obtained from public assessors’ data. This functionality allowed Orlando and Rodnyansky to identify recent high-volume buyers of residential properties in communities in Florida and Georgia. Local governments could partner with those buyers to develop affordable housing solutions in neighborhoods with a high risk of lost affordability or displacement.
- April Urban of Case Western Reserve University’s Center on Urban Poverty and Community Development presented on an integrated property data system being developed in Cleveland that combines administrative records and user-provided data into a comprehensive database that can inform neighborhood improvement strategies. For example, a recent collaboration with Lead Safe Cleveland used customized geocoding processes to link property-level information with information on lead poisoning incidents. Lead Safe Cleveland’s goal is to identify owners of rental properties with potential lead paint contamination and to improve outreach and mitigation options for removing lead paint from older rental units.
- Sarah Duda of the Institute for Housing Studies at DePaul University described an effort underway in Chicago to identify communities at risk of displacement in neighborhoods with rapidly rising housing costs. Once the work is complete, the resulting data could be used to develop preservation strategies for working with long-term owners of two- to four-unit properties in these neighborhoods.
- Amanda Meng from Georgia Tech’s School of Computer Science discussed two recent examples of data-driven housing analyses in Atlanta. One was a study of an anti-displacement tax abatement program, which used both property- and neighborhood-level data to estimate eligibility for and cost of the program. The other was a resident-led effort to identify and report housing code violations and link them to property owner records.
- Henry Gomory from Princeton University’s Eviction Lab described his study of property and landlord characteristics in Boston. Through multi-source data matching, he identified the owners with the highest rates of evictions and problem properties, including those operating through conglomerates or other corporate structures.
- The final presentations of the day highlighted new technical tools and best practices for integrating and using large datasets. Enterprise’s Andrew Jakabovics demonstrated an OpenRefine extension that supports pattern matching across datasets, showing how a list of addresses can be linked to administrative data so that the administrative fields can be appended to the original list. Kelly McElwain shared the process used by the Public and Affordable Housing Research Corporation (PAHRC) to create and maintain the National Housing Preservation Database, which consolidates multiple administrative datasets to track the status of federally assisted rental housing across the country.
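To give a sense of the kind of text matching described in these presentations, here is a minimal Python sketch of a "fingerprint" key-collision approach, loosely modeled on the clustering idea OpenRefine documents. It is a simplified illustration, not OpenRefine's implementation or any presenter's actual workflow, and the buyer names and function names are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a simple key-collision fingerprint for a free-text name."""
    # Strip accents, lowercase, and drop punctuation
    normalized = unicodedata.normalize("NFKD", name)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    cleaned = ascii_only.lower().translate(str.maketrans("", "", string.punctuation))
    # Deduplicate and sort tokens so word order does not matter
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster_names(names):
    """Group names by fingerprint; keep only groups with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical buyer names as they might appear in assessor records
buyers = [
    "Sunrise Homes, LLC",
    "SUNRISE HOMES LLC",
    "LLC Sunrise Homes",
    "Oak Street Partners",
    "Oak Street Partners.",
    "Maple Holdings Inc",
]

clusters = cluster_names(buyers)
for key, variants in clusters.items():
    print(key, "->", variants)
```

Records that share a fingerprint are candidates for the same buyer, which is how spelling variants can be rolled up before counting purchases. A phonetic key could be substituted for the fingerprint to catch misspellings as well; in either case the suggested groupings still need human review.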
Over the course of the day, several common themes emerged. There was broad recognition that local data can be a powerful resource for developing policy and programmatic strategies to address preservation of affordable housing, but much of that data can be idiosyncratic. Working with local data not developed for research can be challenging, but the day highlighted ways in which new technologies can greatly simplify or automate parts of the process of preparing that data for analysis. Although specific datasets likely do not cover multiple jurisdictions, making it hard to directly extend analyses to new geographies, the kinds of datasets researchers may use have many common attributes. Thus, standardization and documentation of data processing strategies are essential for creating tools that can be replicated in additional locations or in future years. Finally, there was great appreciation for the value of sharing data cleaning, analytic, and management tools so that future research can more quickly develop clean datasets and thus devote more time to analysis and community engagement.
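As one illustration of what a standardized, documented processing step might look like, the following Python sketch normalizes street addresses and appends administrative fields to an address list. The suffix table, field names, and sample records are all hypothetical, and real parcel data would need a much fuller set of rules.

```python
import re

# Hypothetical suffix abbreviations; a real project would document a fuller list
SUFFIXES = {"street": "st", "avenue": "ave", "boulevard": "blvd", "road": "rd", "drive": "dr"}


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, abbreviate common suffixes."""
    cleaned = re.sub(r"[^\w\s]", "", address.lower())
    tokens = [SUFFIXES.get(token, token) for token in cleaned.split()]
    return " ".join(tokens)


def append_admin_data(addresses, admin_records):
    """Left-join administrative fields onto an address list; unmatched rows get None."""
    lookup = {normalize_address(rec["address"]): rec for rec in admin_records}
    merged = []
    for address in addresses:
        match = lookup.get(normalize_address(address))
        merged.append({
            "address": address,
            "owner": match["owner"] if match else None,
            "units": match["units"] if match else None,
        })
    return merged


# Hypothetical inputs: a local address list and an administrative extract
addresses = ["123 Main Street", "45 Oak Ave.", "9 Elm Road"]
admin_records = [
    {"address": "123 MAIN ST", "owner": "Sunrise Homes LLC", "units": 4},
    {"address": "45 Oak Avenue", "owner": "Oak Street Partners", "units": 12},
]

merged = append_admin_data(addresses, admin_records)
for row in merged:
    print(row)
```

Keeping the normalization rules in one small, commented function is what makes the step reusable: another jurisdiction can adjust the suffix table without rewriting the join.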
Enterprise Community Partners is committed to expanding the supply of affordable housing because of the benefits it brings to individuals and communities. Tools and strategies like those discussed at the convening are one way we are working to support the preservation of affordable housing options in local communities.