I’ve opened my Government data – What next?

In my previous blog, I published my views about how governments can (and should) conduct due diligence in identifying the relevant datasets from their blackbox databases and open them up. This included identifying attributes of open data and mapping the data to a 5-star rating. Having touched on the significant potential that Open Government Data holds in the evolution of the Smart City ecosystem, I paused my thoughts with a question –

Is the Open Government Data story complete once government entities have made data available in 5-star format (the best possible format)? Or would you say the story has only started on a strong footing?

The answer to that question is fairly obvious to anyone who understands the supply-demand equation of any transaction: supply is only half of it. However, what is important to note in the Open Government Data context is that the demand side of the equation involves substantial dynamics. There are two critical aspects of demand: driving consumption and, more importantly, driving value generation from open government data. The illustration below captures this journey succinctly.

[Figure: Three stages of the Open Govt Data journey]

The rest of this blog will detail what it takes to drive the consumption story for open government data.

Driving Data Consumption

[Figure: Open Govt data journey - Stage 2]

In a simplistic view, all this requires is for government entities to make a few commitments and honor them at all times.

  • Commitment to provide fresh data at all times
  • Commitment to bind the data service with a Service Level Agreement (SLA)
  • Commitment to maintain data quality at all times
  • Commitment to ensure data anonymity

The key to data consumption is to give the data consumer sufficient evidence to instill trust in the relationship. As with any relationship, an honored commitment is the best way to build that trust. Hence, it becomes essential for data provider agencies to stay in lockstep with consumers at all times. One of the key concerns most consumers have is that governments are usually high-handed and set the rules of the game.

Here is a scenario. A flourishing start-up has built a rich mobile app and open API for a service that brings together datasets from 3 different government agencies and combines them with data gathered from 2 private entities. The app sources government data from an open data portal hosted by the government. The app has been in the market for about a year and has seen good uptake because of the uniqueness of the service it offers. The start-up has been making healthy revenues through the mobile app and the open API that renders this service. One of the government agencies has done an internal study and has put in place a regulation that restricts the contours of data shared outside the government. Following this, what if the agency decides that –

  • A certain dataset that was being used by the start-up will not be made available from the next quarter
  • The dataset refresh will be done only once every quarter instead of monthly
  • The nature/quality of data shared will change from the next refresh onwards
  • The dataset access will be blocked completely with immediate effect

The flourishing start-up will have no choice but to rework their innovative service around these new changes, provided that is feasible and practical. Unlike a B2B relationship where both parties have almost equal say, a G2B relationship is steered by one party – the Government.

A lack of transparency has an adverse impact on public trust in the objectives and motives of the government. Open Data consumption is not a one-time task but a continual process, and it requires objective commitment levels from the data source entity over an extended period of time to gain the confidence of data consumers. It is time governments got the balance right in the G2B relationship. As an example, they should come out with clear SLAs that govern the relationship. This is common practice in any B2B or B2C relationship.
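To make the idea of an SLA-bound dataset a little more concrete, here is a minimal sketch of how a consumer (or the provider itself) could check two of the commitments discussed above: the refresh frequency and the notice given before a dataset is withdrawn. The `DatasetSLA` fields and both function names are my own assumptions for illustration and do not come from any particular open data portal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DatasetSLA:
    """Hypothetical SLA terms a provider agency might publish for one dataset."""
    dataset_id: str
    max_refresh_interval: timedelta  # longest allowed gap between refreshes
    withdrawal_notice: timedelta     # minimum notice before access is removed


def refresh_is_overdue(sla: DatasetSLA, last_refreshed: datetime, now: datetime) -> bool:
    """True when the dataset has gone longer without a refresh than the SLA allows."""
    return now - last_refreshed > sla.max_refresh_interval


def withdrawal_honours_notice(sla: DatasetSLA, announced: datetime, effective: datetime) -> bool:
    """True when the gap between announcement and withdrawal meets the notice period."""
    return effective - announced >= sla.withdrawal_notice
```

Under terms like these, a move from monthly to quarterly refreshes, or a block "with immediate effect", would show up as a measurable breach rather than a surprise.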

Another area that governments need to work on is to create the perception that they are doing enough to protect individuals’ rights to privacy and the confidentiality of the data they hold. The last thing a data consumer would want is to be entangled in legal issues because the data was not anonymized [1] or pseudonymised [2] adequately or accurately at the source. Governments should be able to state confidently that the data is anonymized to an extent that rules out any chance of reconstruction through the Mosaic Effect [3].
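As a rough illustration of pseudonymisation at the source, the sketch below replaces direct identifiers with keyed hashes (HMAC-SHA256 from the Python standard library). This is one common technique, not necessarily what any given agency uses, and the function names, field names and key handling are assumptions made for the example.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with an artificial one derived from a secret key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_record(record: dict, id_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the listed identifier fields replaced."""
    return {
        field: pseudonymise(str(value), secret_key) if field in id_fields else value
        for field, value in record.items()
    }
```

Note that this only handles direct identifiers. Fields left untouched (age, postcode and the like) can still be combined with auxiliary data, which is exactly the Mosaic Effect risk, so a step like this is a starting point rather than a guarantee.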

Generating value from Open Data

[Figure: Open Govt data journey - Stage 3]

Once trust between the data source entities and data consumers has been established, it is mostly up to the data consumers to tap into the data and generate value that was unseen for various reasons. More often than not, the value generation comes from the fact that the data consumers are able to correlate various datasets – government data, private data, and proprietary data – and render use cases that wouldn’t be possible otherwise. Having said that, governments can still play a substantial role in positively influencing the larger ecosystem.

As an example, in one of my earlier posts I mentioned the 5-star rating by Tim Berners-Lee. One of the key aspects of open government data, whether rated 1-star or 5-star, is that the data consumer agency should be able to further license the data without restrictions on use as part of the public domain. Public data should be released such that it enables free re-use, including commercial re-use. The possibility to distribute data without restrictions will encourage consumption and generate new avenues in the city ecosystem to leverage the intrinsic value of open government data. This will spur further innovation.

Another way that governments can play an active role in encouraging the community to innovate based on open data is by ensuring that datasets of real value are being made available. While the government may have thought through the data that can be opened up, it is only at the consumption stage that the lacunae in the nature or quality of data become apparent. Governments should establish a mechanism by which consumers can submit their concerns about existing datasets or place requests for more relevant datasets. The government will be able to know the pulse of the consumer community only when such a closed loop exists. At the end of the day, the value of open government data is only realized when the data consumers can generate experiences (through mobile apps, open APIs et al.) that enhance the living experience of the residents.

In conclusion, governments have an active role to play all through the Open Data journey, from data identification to value generation. Most governments consider their job done once the data is made available on the Open Data platform. As mentioned above, that is just half the job and will serve a minimal purpose without a focus on the data consumption side. With large initiatives of this nature, government entities need to keep receiving encouraging signs if they are to stay engaged and if the Open Data initiative is to sustain over a long tenure. Hence, it becomes essential to ensure that the data consumers are also constantly engaged and their expectations are reasonably met. The need is to establish an ecosystem where all stakeholders participate and play their role towards delivering an enhanced living experience.


[1] Anonymised Data – Data relating to a specific individual where the identifiers have been removed to prevent identification of that individual.
[2] Pseudonymised Data – Data relating to a specific individual where the identifiers have been replaced by artificial identifiers to prevent identification of the individual.
[3] Mosaic Effect – The process of combining anonymized data with auxiliary data in order to reconstruct identifiers linking data to the individual it relates to.

SMART CITY needs SMART DATA needs SMART GOVERNMENT

The most significant outcome of a smart city (and the key indicator) is to provide citizens of the city alternatives and opportunities to lead a better life. This could be in the form of efficient and effective public transportation, proactive traffic monitoring and easing, automated monitoring of utility services, weather management, emergency management, public safety and, more importantly, an amalgam of these services through correlations. Each of these Smart City services (and please note that the above list is not exhaustive) is data-intensive and results in reams and reams of real-time data which, when leveraged, can generate meaningful insights, further driving an enhanced experience for all city stakeholders.

While city agencies and governments worldwide have been spending effort through various initiatives (e.g., Share-PSI) to tap into this data and generate value, they are also limited by the resources (time, money, labor) at their disposal. What if the reams of data generated through the city/government initiatives were made available to private entities and the general public at large? Of course, this needs careful scrutiny of what data can be shared beyond the boundaries/firewalls of the agencies. However, that should be a small hurdle to overcome considering the immense potential of the data that these external stakeholders will tap into, further enhancing the city ecosystem. This needs governments to open up: open up between themselves and open up to the external world. This needs Open Government Data.

In an earlier blog, I highlighted how government data can be used in different contexts – Government to Government (G2G), Government to Business (G2B) and Government to Citizen (G2C). The progression to Open Government Data needs a methodical approach and ideally takes the following transition path.

[Figure: Open Data graphic - Transition]

Each government agency needs to scrutinize its data to identify datasets that are sought by other agencies and non-sensitive datasets that can be opened up between agencies. A further level of scrutiny is required to identify the subset of data that can be exposed to non-government city stakeholders (private entities, general public).

However, not all data that can potentially be opened up will be really helpful. Some of the data may be in a very crude form and will not help data consumers, since they cannot leverage it without extensive effort and investment. For example, scanned (anonymized) application forms are of little value until the data is actually digitized, either through some OCR mechanism or manually. This discourages the consumer (more specifically, the technical community) from tapping into the data even if it is made available. In this era of DevOps and agile, the idea with Open data is to provision datasets that can be tapped into easily and generate value quickly. So, how does one identify high-value datasets – data that is smart by default?

What does Smart Data mean?

While there cannot be a binary method of identifying Smart data, some very detailed parameters have evolved from the discussions at The Open Group. One such discussion has arrived at the following 9 dimensions of quality that should be applied to data:

[Figure: Attributes of Good open data]

While these 9 quality parameters are important, one needs to look into the specific business requirement and the corresponding datasets to assign a weightage factor to each parameter to suit the context. Each parameter also has a further level of detail that has to be studied before declaring the data to be of high quality. For example, is Credibility defined only by the trustworthiness of sources? What if the data has undergone some transformation in the interim before being made available?
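One simple way to apply context-specific weightage is a weighted average of per-parameter scores. The sketch below assumes each parameter has already been scored between 0 and 1 by some assessment process; the scoring scale, the function name and the example weights are assumptions of mine, and only Credibility and Processability are named here because they are the parameters discussed in the text.

```python
def weighted_quality_score(scores: dict, weights: dict) -> float:
    """Weighted average of quality scores (each assumed to lie in [0, 1]).

    `weights` decides which parameters count and by how much for a given
    business context; every weighted parameter must have a score.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score supplied for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Illustrative values only: a context that cares three times more about
# processability than about credibility.
example_scores = {"credibility": 0.8, "processability": 0.4}
example_weights = {"credibility": 1.0, "processability": 3.0}
example_result = weighted_quality_score(example_scores, example_weights)
```

The same dataset can score well for one use case and poorly for another simply by changing the weights, which is the point of assigning them per context.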

Another example – the Processability parameter mentioned above can also be studied further using the 5-star data definition provided by Tim Berners-Lee.

[Figure: 5-star rating of Open data]

Most government agencies will have a mix of these different segments of rated data, with a heavy leaning towards one-star and two-star data. While one-star and two-star data is fairly easy to generate, it limits data usage on the consumers’ side when exposed and made available as Open Data. Generally, very few consumers are willing to invest in, or competent enough for, refining the provider data further to make it more consumable. Hence, the uptake of this kind of data will be low. Provider agencies will need to invest in progressing further on the maturity roadmap: make data non-proprietary, add semantics and link to related data/content. More importantly, they should adopt these new methods for all data generated to date and in the future. As a data provider agency progresses on this maturity roadmap, it will start seeing corresponding adoption and value generation from the larger city ecosystem. The progression towards 5-star data will involve a change in organization practices and culture, but once that becomes business-as-usual, the effort required is fairly low compared to the uptake one gets to see on the consumer end.

[Figure: Effort - Provider vs Consumer]
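For readers who do not have the 5-star scheme in front of them, it is cumulative: each star requires everything below it. A minimal sketch of that logic follows; the `DatasetProfile` fields are my own shorthand for the five criteria, not an official schema.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Shorthand flags for the five criteria of the 5-star open data scheme."""
    open_licence: bool            # 1 star: on the web, any format, open licence
    machine_readable: bool        # 2 stars: structured data (e.g. a spreadsheet, not a scan)
    non_proprietary_format: bool  # 3 stars: open format (e.g. CSV instead of Excel)
    uses_uris: bool               # 4 stars: URIs identify things (e.g. RDF)
    links_to_other_data: bool     # 5 stars: linked to other data for context


def star_rating(profile: DatasetProfile) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        profile.open_licence,
        profile.machine_readable,
        profile.non_proprietary_format,
        profile.uses_uris,
        profile.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars
```

A scanned form published under an open licence would rate one star here, the same data as a spreadsheet two, and as CSV three, which matches the leaning towards one-star and two-star data described above.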

How can governments be smart?

Most governments worldwide have opened up to the idea of Open data, and the ones that have not can only delay the move; they will eventually get there. The question is no longer whether government agencies will open their data, but when and how they will open it. It requires strategic planning by governments to execute initiatives of this nature and to drive collaborative execution across agencies. A substantial focus on adoption enablement, to ensure governance and adherence to standards, is essential.

Exchanging data between agencies does not come naturally to most government organizations, and when they do share data, they rely on very manual or archaic methods: paper-based requests, phone requests, email requests etc. Initially, the agencies have to move to an operating model where data is made available on a data exchange platform through a single window (e.g., a portal). Data can be requested and procured through the same window, either in real time or in batch mode depending on the nature of the request. At a minimum, this will ease government operations and make them more effective and efficient. It also makes life easier for citizens, who no longer have to share the same data multiple times with different agencies.

This is best implemented by encapsulating the data sharing services as APIs since it can potentially foster further innovation within the government ecosystem.

Once the data has been opened up between agencies, it becomes relatively easy to progress to sharing the non-sensitive data with non-government stakeholders. The API approach can be leveraged further to encourage innovation in the digital economy.

[Figure: Open Government data - Progression path]

The next level of progression will be to move to linked open government data (LOGD) and use this as a revenue stream. LOGD can demonstrate value in a wide range of use cases that were not thought of earlier. As an example, imagine the impact of linking real-time public transport services data (from the Transportation department) to an event in the city (organized by the Tourism department) and to the weather data (gathered from the Meteorological department) to help citizens plan their journey.
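The journey-planning example can be sketched as a small join across the three datasets. Everything below is made up for illustration: the field names, the shared keys (a stop identifier and a district/date pair) and the function name are assumptions, and real LOGD would link records through shared URIs rather than ad hoc dictionary keys.

```python
from datetime import time


def plan_trip(event: dict, departures: list, forecasts: dict) -> dict:
    """Combine event, transport and weather data around a single city event."""
    # Weather: look up the forecast for the event's district and date.
    forecast = forecasts.get((event["district"], event["date"]))
    # Transport: keep services to the stop nearest the venue that arrive in time.
    on_time = [
        d for d in departures
        if d["stop_id"] == event["nearest_stop_id"] and d["arrives"] <= event["starts"]
    ]
    # Prefer the latest arrival that still makes the start time.
    best = max(on_time, key=lambda d: d["arrives"], default=None)
    return {"event": event["name"], "service": best, "forecast": forecast}
```

The join only works because the three datasets share identifiers for places and dates, which is what the linking step in the progression path is meant to provide.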

Governments need to take up planned initiatives to tap into the potential of locked up data. The data needs to be pruned and polished to make it more relevant and ease consumption. This data, once tapped into by the city ecosystem, can be applied in daily-life scenarios that impact the community and thereby, deliver a signature city experience. The possibilities are immense. All that is required is to take the initiative and tap into the value of the new natural resource – data. The sooner the better.

Closing thought – Is the Open Government Data story complete once government entities have made data available in 5-star format (the best possible format)? Or would you say the story has only started on a strong footing? There is a lot more to follow…