On May 8th, fifty urban planners, data scientists, and civic hackers came together for the Department of City Planning’s first data jam!
For the day, participants improved NYC’s most comprehensive dataset of facilities. This facilities database is produced by the NYC Department of City Planning and is used to shape NYC’s neighborhoods. It captures a wide range of sites and services, including education, healthcare, social services, parks, cultural institutions, transportation, and industrial uses. Mapping this footprint of facilities and programs within NYC is essential for planners to understand the distribution of public services, strategically site new facilities, and inform capital investment decisions.
Throughout the day, small groups of ten tackled several critical data quality challenges, like duplicate records and unreliable addresses, that have limited the planning analysis that can be conducted using the data.
The data jam participants explored and presented different methods for addressing these difficult data quality challenges and brainstormed frameworks for strategically siting new services and engaging with the public to make this information more accessible and easier to understand. This data jam was co-hosted by BetaNYC, the American Planning Association (APA), and NYC Planning as part of the 2017 APA National Conference in NYC, with the goal of facilitating connections between urban planners and the civic technology community to generate new, creative ideas for solving urban challenges with advanced technology.
“Really appreciated the work to pre-configure the data jam with specific but open-ended questions. Lots of cool things possible around FacDB to serve vulnerable populations imho.” – Varun from Argo Labs
Read more about Varun’s experience on Argo Labs’ blog.
Event Challenges & Team Notes
- Challenge 1 – How can the rate and accuracy of identifying duplicate records be improved? (Link)
- Challenge 2 – How can sites that are administrative locations rather than service locations be identified in FacDB? (Link)
- Challenge 3 – How can complete BBL and BIN info be gathered for large facilities, such as campuses that include multiple buildings and tax lots, by hacking together different datasets? (Link)
- Challenge 4 – How can the FacDB database architecture and maintenance process be improved in order to streamline updates and improve data quality? (Discussion)
- Challenge 5 – How can DCP and other City agencies use FacDB to inform the co-location of compatible facilities? (Link)
- Challenge 6 – What tools and analysis can be developed using the FacDB to better empower communities to make their needs known and become more informed? (Link)
- The data jam was successful for DCP’s public outreach and for building relationships with members of the civic tech community who care about DCP’s work.
- Both DCP facilitators and participants learned about new analysis techniques and developer tools/packages that can be applied to their work.
- The teams’ work provided a few immediately actionable improvements to the database, e.g. obvious duplicates that were quick fixes, but mainly surfaced ideas to explore further, including:
– New facility categories and datasets to include in the database
– Fresh perspective and questions around “what is a facility?” that will guide future planning and approaches
– Perspective on additional use cases for the data
– Machine learning deduping package to test out
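For readers curious what a "quick fix" for obvious duplicates might look like, here is a minimal sketch. It is not the approach any team used at the jam, and it is far simpler than a machine learning deduping package: it only groups records whose name and address match after basic normalization. The field names (`id`, `name`, `address`) and the sample records are made up for illustration and are not the FacDB schema.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, strip punctuation, and collapse repeated whitespace."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def find_duplicate_groups(records):
    """Return lists of record ids that share a normalized (name, address) key."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        groups[key].append(rec["id"])
    # Only keys seen more than once are candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data, not real facility records.
records = [
    {"id": 1, "name": "Example Branch Library", "address": "10 Main St."},
    {"id": 2, "name": "EXAMPLE BRANCH LIBRARY", "address": "10  Main St"},
    {"id": 3, "name": "Example Health Center", "address": "25 Side Ave"},
]

duplicate_groups = find_duplicate_groups(records)
```

Exact matching on normalized keys only catches the easy cases. Records that differ by abbreviations or typos would need fuzzy matching or a trained model, which is where a dedicated deduping package comes in.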
“[…] just wanted to say thanks for organizing the event today – very inspiring to see DCP doing such awesome work. I was the person that presented from Challenge #4 about DB architecture. Had a great time discussing the options, and would love to continue to be involved if there is help needed. Also, was very happy to learn about the PostgreSQL Hash method of tracking changes in data- way cool!”
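The hash method mentioned in the quote above was not documented in this recap, so the following is only a rough sketch of the general idea, written in Python rather than PostgreSQL: compute a hash of each row's contents, then compare hashes between two snapshots to see which rows were added, removed, or changed. The function names, field names, and sample rows are all assumptions made for illustration.

```python
import hashlib


def row_hash(row, fields):
    """Hash the selected fields of a row into a single hex digest."""
    joined = "|".join(str(row.get(f, "")) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def diff_snapshots(old_rows, new_rows, key, fields):
    """Compare two snapshots by row hash and report what changed."""
    old = {r[key]: row_hash(r, fields) for r in old_rows}
    new = {r[key]: row_hash(r, fields) for r in new_rows}
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "changed": sorted(k for k in new if k in old and new[k] != old[k]),
    }


# Hypothetical snapshots of a facilities table.
old_rows = [
    {"id": 1, "name": "Example Library", "address": "10 Main St"},
    {"id": 2, "name": "Example Clinic", "address": "25 Side Ave"},
]
new_rows = [
    {"id": 1, "name": "Example Library", "address": "12 Main St"},
    {"id": 3, "name": "Example School", "address": "40 Park Rd"},
]

changes = diff_snapshots(old_rows, new_rows, key="id", fields=["name", "address"])
```

Storing one hash per row means an update only needs to compare short digests instead of every column, which is presumably part of the appeal for a database that is rebuilt from many source datasets.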
74 Applicants / 25 Waitlist
15 Staff / Male – 6 / Female – 9
40 Accepted Applicants / Male – 21 / Female – 19
50 Event Participants / Male – 25 / Female – 25
Photos from the day