This article was originally published on Government Technology.
Your city generates a lot of data — data about your citizens, your businesses, your roads, the infrastructure that keeps your city livable, and more. Turning this data into useful information requires the proper capture, storage and accessibility of metadata (data about data).
Think about the data related to the thousands of incident reports filed by your police department each month. Metadata associated with these incidents allows you to identify patterns in where crimes are happening, how quickly officers respond, what equipment they turn on, and so on. This improves a city’s ability to deploy and train its officers efficiently, which in turn improves the safety of its citizens.
Metadata can also reveal on-the-ground outcomes of a city’s planning and policy choices. It can be analyzed to help city leaders predict population growth and anticipate where additional resources — such as utilities, after-school programs and parks — will be needed. It can help untie, or at least loosen, the Gordian knot of rush-hour traffic by analyzing patterns and helping city leaders make decisions regarding when to increase public transportation or improve and expand roads, highways and freeways.
My company recently partnered with our local police department, the Austin PD, to organize and structure its data storage for dash camera videos. Merely recording gigabytes of video each day isn’t terribly helpful, but when it’s properly indexed and stored for later search and retrieval, the videos’ metadata becomes extremely powerful.
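To make "indexed for later search and retrieval" concrete, here is a minimal sketch in Python of a metadata index for video files. It is not the system used in Austin. The record fields (`unit`, `incident_type`, `recorded_at`), the class names and the sample values are all hypothetical, chosen only to illustrate the idea that each video is tagged once, when it is stored, and can then be found by any combination of tags.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class VideoRecord:
    """Metadata for one stored video (all field names are hypothetical)."""
    video_id: str
    unit: str            # patrol unit that recorded the video
    incident_type: str   # tag applied when the video is stored
    recorded_at: datetime


class MetadataIndex:
    """Small in-memory index: field -> value -> set of video ids."""
    INDEXED_FIELDS = ("unit", "incident_type")

    def __init__(self):
        self._records = {}
        self._by_field = defaultdict(lambda: defaultdict(set))

    def add(self, record):
        # Tag the video at capture time so it is searchable immediately.
        self._records[record.video_id] = record
        for field in self.INDEXED_FIELDS:
            self._by_field[field][getattr(record, field)].add(record.video_id)

    def search(self, **criteria):
        # Intersect the id sets for every requested field/value pair.
        ids = set(self._records)
        for field, value in criteria.items():
            ids &= self._by_field[field].get(value, set())
        return sorted((self._records[i] for i in ids),
                      key=lambda r: r.recorded_at)


index = MetadataIndex()
index.add(VideoRecord("v1", "A12", "traffic_stop", datetime(2016, 3, 1, 9, 30)))
index.add(VideoRecord("v2", "A12", "collision", datetime(2016, 3, 1, 14, 5)))
index.add(VideoRecord("v3", "B07", "traffic_stop", datetime(2016, 3, 2, 22, 15)))

hits = index.search(unit="A12", incident_type="traffic_stop")
print([r.video_id for r in hits])
```

The design choice worth noting is that the video files themselves are never scanned during a search. Only the small metadata records are consulted, which is what keeps retrieval practical as the volume of footage grows.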
As useful as metadata is, however, using it isn’t always as easy as it might seem.
Challenges with metadata
Storing huge amounts of data indefinitely and tagging it with metadata is an investment. To make good use of data, cities need to commit the time and money to properly capture and store it. This financial and resource commitment is growing each year as the volume of data that cities collect grows. Some estimates put the data growth rate at more than 40 percent annually.
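A quick calculation shows what a 40 percent annual growth rate implies for storage planning. The sketch below assumes steady compound growth and a hypothetical starting volume of 100 TB. Both are illustrative assumptions, not figures from any particular city.

```python
import math


def projected_volume(current_tb, annual_growth, years):
    """Volume after `years` of compound growth at `annual_growth` (0.40 = 40%)."""
    return current_tb * (1 + annual_growth) ** years


def doubling_time_years(annual_growth):
    """Years needed for the stored volume to double at a constant growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


# Hypothetical starting point: 100 TB stored today, growing 40% per year.
print(round(projected_volume(100, 0.40, 5), 1))  # 537.8 (TB after five years)
print(round(doubling_time_years(0.40), 2))       # 2.06 (years to double)
```

Under these assumptions, stored data doubles roughly every two years and grows more than fivefold in five years, which is why the cost of capturing and storing it keeps rising.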
Much of this data is stored in silos, which makes it more cumbersome to access and more vulnerable to data loss. Often, metadata is a game of catch-up. Trying to apply structure to unstructured data that’s stranded across several network-attached storage servers in various locations can be a huge undertaking without searchable storage.
But keeping that data safe when it is searchable is a challenge as well — especially considering the ongoing attacks against government computer systems, like the one experienced by the U.S. Office of Personnel Management just last year. Personally identifiable information, like Social Security numbers and information pertaining to ongoing criminal investigations, has to be legally protected and hidden — but also accessible to law enforcement and other governmental entities.
Adopting a storage solution that works for you will be the key to taking on the data challenges of the future. Once you find the right solution, data that is properly stored with metadata can be put to an almost unlimited range of uses.
Metadata as it stands
The mayor of San Francisco launched an innovative project, DataSF, that gives the public access to swaths of searchable government data. The idea is to help businesses and citizens make better, more informed choices and — by being more transparent — increase mutual accountability.
For instance, one outcome of the new data-sharing project is a partnership with Yelp, the crowd-sourced review site. The application gives users access to the health inspection records for the eateries listed in its popular restaurant reviews. The outcome is that more people are able to avoid unsanitary eating conditions, and restaurants have more incentive to follow city health ordinances.
Likewise, Los Angeles recently started a program called CleanStat, which is a database that measures the amount of garbage on city streets. Inspectors examine and grade every street and alley throughout the city. The city supplements these reports with geocoded video footage taken from garbage trucks.
For a city that covers roughly 500 square miles, guessing at how to combat the trash problem is a sure way to guarantee that the next generation of residents will still be dealing with trash on the streets. On the other hand, getting a comprehensive survey of current conditions, then storing that data with relevant metadata that can be analyzed to develop a strategic solution, is the best way to actually fix the problem.
Metadata helped Chicago fight its rat problem. Cell phone metadata helps Italy fight organized crime by revealing who the influencers are, such as the people at the centers of webs of phone calls. And Boston is using aggregated data from Uber to better understand traffic flow and guide traffic engineers in timing lights and modifying intersections. The application of search and analysis using metadata is limited only by the quality and searchability of the data and the strength of the imagination.
The future of metadata
As the investment in big data continues to grow, the demand for results from analyzing that data will grow with it. Cities are no exception. They need searchable storage with accurate metadata to retrieve relevant data from the mountain of information they have stored.
That mountain will continue to grow, so it’s better to organize each pebble as it’s created and make it useful to the city now rather than throw it onto the pile and sort it later. Don’t leave your data unusable by haphazardly storing it.
Data storage becomes even more important as cities move toward predictive analysis, arguably one of the hottest topics surrounding data today. While data is about what happened in the past, used holistically, it can illustrate trends that predict the future. Parsing data for these trends is impossible if the data is stored in silos with no way to consolidate and search on key elements.
Storing your data with metadata is worth the investment. In today’s era of limited revenue and tough trade-offs, cities need good information and insight to make effective management and investment decisions. They need just the kind of insight that metadata can unlock.