BetaNYC’s testimony to NY City Council Committee on Technology – Open Data oversight hearing 1 Oct 2015

Dear Chairman Vacca and Councilmembers,

New York City’s open data ecosystem is one of the world’s best. We are very sorry that BetaNYC’s leadership can’t be there in person.

First, BetaNYC and the NYC open data community have experienced an amazing 18 months. We are excited to see the Council and this administration commit to making open data work for all.

Our community’s explosive growth is a testimony to the success of the City’s open data program. We are excited to work with the Council, the Mayor’s Office, and agencies to improve access to information and build a government for the people, by the people, for the 21st century.

This testimony is broken into three parts:

A recap of BetaNYC’s partnership with NYC’s open data ecosystem.
An introduction to NYC’s Civic Innovation Fellowship, a program to strengthen open data use within government.
Methods to strengthen our open data ecosystem.

As of this month, BetaNYC’s community is 3,000 members strong.

We continue to offer weekly open data programming to all who want to learn.

In February, we partnered with the Mayor’s Office of Data Analytics (MODA) to host our third winter hackathon, Code Across NYC. Over 600 people attended this event to learn about the City’s open data program and to use data, technology, and design to improve their communities.

In August, the Department of Citywide Administrative Services (DCAS) unveiled an updated City Record with improved events data that allows you to subscribe to local public hearing notifications. Throughout that process, we worked to educate DCAS and the Mayor’s Office of Technology and Innovation (MOTI) on the value of detailed location data. We hope to continue this partnership and continuously improve the City Record’s data.

Currently, we are a part of NYC BigApps, providing a community data platform and an evening workspace where projects can develop.

Community tools that power NYC’s open data ecosystem

Last October, we launched Citygram.NYC. Over the last year, we have improved the program, which lets you subscribe to 311 service alerts, vehicle collisions, and restaurant inspections.

http://www.citygram.nyc

In the Spring, we launched two community support tools.

First, we launched NYC’s community data portal – Data.Beta.NYC. It is a community data portal designed for NYC by NYC. We have a five-person volunteer team maintaining 112 datasets and assisting 19 organizations.

http://data.beta.nyc

Data.beta.nyc is an example of how agencies can inexpensively run their own data portals. Additionally, we think we have solved the perplexing issue of conversations around datasets.

https://talk.beta.nyc

To complement data.beta.nyc, we launched talk.Beta.NYC – the online home for NYC’s civic tech community. “Talk” is the central clearinghouse for questions on the city’s open data, civic tech events, and thematic workgroups. On average we have 400 monthly users.

…this is just the beginning.

Introducing NYC’s Civic Innovation Fellowship

In partnership with the Manhattan Borough President’s office, we’ve launched the NYC Civic Innovation Fellowship (CIF). This program is building the next generation of civic hackers, policy wonks, and hopefully a few City Council members.

We are creating new working relationships between NYC Community Boards and the next generation of community leaders through training and employment in human-centered, data-driven decision making.

Our objectives are threefold:

Community Boards – Build capacity through digital civic literacy and mapping, planning, and improvement tools.

CUNY Service Fellows – Educate and empower a new generation of community leaders who know how to use civic design, technology, and open data.

Borough Presidents’ Offices – Train and support Community Boards to better communicate and engage with their constituents.

In the short term (next 5 years), a successful outcome of the CIF program would be the establishment of a basic, effective curriculum and the solidification of the relationship between Community Boards and the Service Corps program. These accomplishments would improve the experiences of both youth and Community Board members as well as increase NYC’s commitment to (and implementation of) open data quality & standards. After two successful cycles for Manhattan Community Boards, the program will reach out to the other four Borough Presidents’ offices, with the eventual goal of establishing the program in all five boroughs.

In the long term (5-10 years), the goal of the project is to expand the number of participating youth and Community Boards. Ultimately, the program should be able to accommodate Community Boards from all five boroughs and have the resources in place to serve City Council member offices and selected New York State Senate and Assembly offices in the same fashion.

Opportunities to strengthen NYC’s open data practice

BetaNYC’s future depends on a mature open data program. Our ideas are not directly in conflict with any of these pieces of legislation. We can see that the Council has listened to our observations and worked to enshrine them into law. We are thankful and grateful for having an attentive City Council.

In general, we see the underlying message behind these bills as a much needed adjustment to the City’s existing open data law. With a shared frustration, we too want a more transparent, accessible, and accountable open data program.

Here are our recommendations to address our shared frustrations.

Int 0916-2015 – Compliance audit

We would love to have an oversight body that doesn’t cherry-pick oversight. We need a body whose mission is to fully engage with the community of data users and provide detailed assessments of how and where improvements can be made.

This might be an opportunity to invigorate the Commission on Public Information and Communication (COPIC), which has a mandate to oversee the City’s public information.

Int 0915-2015 – Updating datasets

In a paper world, three-day notifications are great. In the digital world, notifications are instant. If agencies were resourced to run their own open data portals, we believe we would receive higher quality data and notifications. By developing an ecosystem of data portals, there would be no need to “send data” to the city’s single open data portal.

We believe that the City can maintain efficient open data management tools. There are many open source data management tools that are operated by small teams. These teams collaborate with each other to maximize feature development.

Fundamentally, agencies need to be responsible for their own data. We feel that empowering agencies to own and manage their own open data repositories will give the public better access to data handlers and improve data quality.

We encourage the City to explore the use of open source data management tools to ensure that data quality is increased and agency empowerment is maximized.

Int 0890-2015 – Archiving of datasets

This proposed legislation is exciting but raises two questions.

To what extent do we need archival data?
Why don’t we consider the technical standards manual a living document?
To what extent do we need archival data?

In an ideal world, we would copy every bit and keep it behind a version-controlled application programming interface. This would provide an ideal time machine!
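As a rough illustration of the "time machine" idea, here is a minimal sketch of a snapshot archive that answers "what did this dataset look like on a given date?". The class and method names (`SnapshotArchive`, `save`, `as_of`) are hypothetical and do not describe any existing City system; a real archive would sit behind an API and persistent storage rather than in memory.

```python
import bisect
from datetime import date


class SnapshotArchive:
    """Hypothetical in-memory archive of dated dataset snapshots."""

    def __init__(self):
        self._dates = []  # snapshot dates, kept sorted
        self._rows = []   # rows for each snapshot, parallel to _dates

    def save(self, as_of, rows):
        """Store a copy of the dataset as it looked on `as_of`."""
        index = bisect.bisect_right(self._dates, as_of)
        self._dates.insert(index, as_of)
        self._rows.insert(index, list(rows))

    def as_of(self, when):
        """Return the most recent snapshot taken on or before `when`."""
        index = bisect.bisect_right(self._dates, when)
        if index == 0:
            return None  # nothing was archived that early
        return self._rows[index - 1]


archive = SnapshotArchive()
archive.save(date(2015, 1, 1), [{"id": 1}])
archive.save(date(2015, 6, 1), [{"id": 1}, {"id": 2}])
march_view = archive.as_of(date(2015, 3, 15))  # the January snapshot
```

Keeping full copies is the simplest design to reason about; storing only the differences between versions would save space at the cost of more complex lookups.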

Our experience indicates that the current open data practice is really bad at time travel. We have no data dictionaries, a variety of formats, and at times conflicting data. Not only do we share the desire for historical data, we are passionate about agencies having the internal capacity to improve their data quality.

Our City’s open data practice needs to expand its technical resources to own and maintain a variety of data ideals. We need additional talent to help agencies liberate data, consider its long-term use, and ensure data proliferation.

Why don’t we consider the technical standards manual a living document?

For the last two years, we have encouraged the City to join us in an open conversation to update the technical standards manual. We have always seen the standards manual as a living document.

We always thought having the public help update the standards manual was part of the plan. The current legislation doesn’t ensure the standards manual is updated with public participation.

We agree that the technical standards manual needs to be updated, and the public should be included. To strengthen this legislation, public engagement needs to be enshrined in the City’s open data law.

Int 0898-2015 – plain language data dictionary

We absolutely agree that every dataset needs a data dictionary. This raises fundamental questions: who is maintaining our data formats and data quality, and what is the long-term plan to empower agencies to maintain their own data?

This bill makes it clear we need more communication around datasets, and we’re not sure that singling out additional data dictionaries is the appropriate solution.

First, we do need data dictionaries. Second, we need conduits to data maintainers and subject matter experts to correct errors. We created talk.beta.nyc as a home to help address this issue.

We hope to work with the Council and the Mayor’s office to help improve access to our City’s dataset maintainers.
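To make the two needs concrete, here is a minimal sketch of a data dictionary entry that pairs a plain-language description with a contact for the maintainer, plus a check for columns that still lack documentation. The field names (`description`, `maintainer_contact`, and so on) are our own assumptions for illustration, not a format defined by the bill or by the City’s technical standards manual.

```python
from dataclasses import dataclass


@dataclass
class ColumnEntry:
    """One hypothetical data dictionary entry for a dataset column."""
    name: str
    description: str              # plain-language meaning of the column
    data_type: str                # e.g. "text", "date", "number"
    maintainer_contact: str = ""  # where to report errors (assumed field)


def undocumented_columns(dataset_columns, dictionary):
    """Return the dataset columns that have no non-empty description."""
    documented = {
        entry.name for entry in dictionary if entry.description.strip()
    }
    return [column for column in dataset_columns if column not in documented]


dictionary = [
    ColumnEntry("complaint_type", "Category chosen by the 311 operator", "text"),
    ColumnEntry("created_date", "", "date"),  # description still missing
]
missing = undocumented_columns(
    ["complaint_type", "created_date", "borough"], dictionary
)
```

A check like this could run whenever a dataset is published, so gaps in documentation surface early and reach the right maintainer.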

Int 0900-2015 – standardizing addresses

We think that this issue is closely related to improving the technical standards manual. We shouldn’t have to legislate specific data formats, addresses included. Then again, some of our most frustrating conversations around the City Record’s data were around address data standardization.

While we feel uncomfortable mandating address location data, we do feel that all addresses need to be as detailed as possible. As addresses are added to the City’s data portal, they should be converted to human readable formats. Every agency should make maximum use of the City’s own geocoder.

Lastly, we need agencies to own this issue. We cannot depend on DOITT to be responsible for all data translation issues. We need participating agencies to be empowered to improve their own data. Addresses are just one small part of the data ownership issue.
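As an illustration of what "human readable" could mean in practice, here is a minimal sketch that assembles separate address fields into one consistently formatted line. The function and field names are hypothetical, and the sketch only tidies text; it does not call the City’s geocoder, which would still be needed to validate an address and attach coordinates.

```python
def _tidy(text):
    """Collapse extra whitespace and capitalize each word (e.g. 'WEST' -> 'West')."""
    return " ".join(word[0].upper() + word[1:].lower() for word in text.split())


def readable_address(house_number, street, borough, zip_code=""):
    """Build a single human-readable address line from separate fields."""
    street_line = _tidy(f"{house_number} {street}")
    borough_line = _tidy(borough)
    state_line = f"NY {zip_code.strip()}".strip()
    parts = (street_line, borough_line, state_line)
    return ", ".join(part for part in parts if part)


example = readable_address("253", "  BROADWAY ", "manhattan", "10007")
# example == "253 Broadway, Manhattan, NY 10007"
```

This simple capitalization rule mishandles names such as "McDonald", which is one reason a shared, authoritative geocoder is preferable to each agency writing its own cleanup code.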

Int 0908-2015 – sharing FOIL data

We believe that all FOIL’ed data should be made available and the City should default to open. This bill expresses that intent, but is vague around prioritization. Additionally, it doesn’t address how those datasets would be kept up to date, nor how FOIL’ed records would impact the release of the larger dataset.

Int 0914-2015 – responding to dataset requests

We are in absolute agreement around dataset request transparency and building platforms for public <> government engagement.

Looking at the City’s own data, there are close to 180 dataset requests. Six have been approved; seven rejected.

From our experience, we need more agency engagement. We need to have conversations with each dataset’s owner, identify the important parts of the dataset, and maintain an ongoing conversation about data quality. We don’t always need more data; we need better data.

Conclusion

These are very exciting times. New York City has one of the world’s best open data ecosystems. To have so many Councilmembers actively interested in improving NYC’s open data ecosystem is important. We cannot build the future without you.

While we support the intent of these bills, we are not sure all of these bills need to move forward. We encourage the Council to act as a broker and bring together dataset users and agency maintainers to address our issues.

In conclusion, we thank the Council, the Administration, and this Committee’s leadership for bringing us where we are today. Your leadership is indispensable in building an open data ecosystem for all.

If you have feedback or comments, please leave them on talk.beta.nyc.