The devil in data
“Data is the new oil” is a currently popular aphorism. To the extent that it connotes a resource of great value, there is some validity to the analogy. There is, though, one significant difference: one is composed of atoms and the other of bits. Giving away the first deprives you of it; but, when you “give away” data to someone else, it still remains with you. When a company exports a million tonnes of iron ore, that ore is gone from India; however, if an organisation chooses to export some of its data, the very same data can remain here too. In the shorthand world of today, it is important not to get carried away by the analogy, and to recognise this difference between atoms and bits, between oil and data.
Growing digitalisation means that there is now a great deal of data about individuals, about you and me. Much of this is personal (bank account details, health records, passwords, and the like), and this important information needs to be safeguarded from wrongful access or misuse. Countries around the world are, therefore, putting in place laws for the protection of data. These laws cover the extent, purpose and use of data, and the responsibility of those who collect and store it. The European Union’s General Data Protection Regulation (GDPR) is amongst the most comprehensive, and serves as a model for others. In India, the government has been working on a similar law for data protection and privacy. A committee, headed by Justice B.N. Srikrishna, has prepared a draft after extensive consultations. A law based on this is expected to be tabled for parliamentary approval very soon, and is a declared priority of the new government.
A law to protect data and ensure its privacy is certainly welcome and, in fact, overdue. However, to the extent it constrains the use and sharing of data, it will have an impact on business and on customer service. Today, many companies process and analyse data to get insights about an individual’s tastes, preferences, lifestyle, etc., in order to design new products or offer the most appropriate one. A bank, for example, might analyse all your transaction data and financial position to tailor-make a loan package — optimising the amount, duration and interest rate — specifically suitable for your present and future financial situation. It could also create an investment portfolio that matches your needs. Similarly, based on your past searches and other data, analytics helps companies to predict and cater to your likely interests. No surprise, then, as to how ads for hotels in Singapore pop up on your screen soon after you have searched for alternative flights to that destination. Useful service or an invasion of your privacy? You decide.
Such analytics requires great amounts of data; the larger the data-set, the more accurate the analysis. Thus, in healthcare, analysis of stored patient records helps to determine the appropriate medication for a new patient based on his/her health and genetic, demographic and other parameters. In areas like oncology, automated diagnosis and prescription (by IBM’s Watson, for example) is generally as accurate as, or more accurate than, that by specialists. In agriculture, accurate crop yield predictions are now possible because models have been developed using historical data on various parameters and correlating them with actual yield figures. In all these cases, accuracy of prediction depends on the amount of data available.
Such models use a combination of data analytics, artificial intelligence (AI) and machine learning, requiring a variety of high-end skills and sophisticated, large-scale computing power (e.g., climate modelling or simulating a nuclear explosion requires super-computers). To capitalise on the new technologies, India needs to develop its human capital and invest heavily in R&D in these fields.
This need for data links to another aspect of the analogy with oil: the concept of sovereignty. It is argued that just as oil within its territory belongs to the country, so does data generated in the country. This leads to the “data localisation” principle, now adopted by some countries, especially for certain types of data, which requires that such data be stored within the country. One argument for this is law enforcement, especially in the context of money laundering and terror financing, where immediate access to financial transaction data may be required. Despite mutual assistance agreements, accessing data stored abroad is difficult and slow; locally stored data can be accessed more easily (hopefully, after due legal process).
Localisation votaries also invoke economics. Data is the vital ‘raw material’ for a host of AI applications. In this, given its large population and the extent of digital penetration, India is extremely well-endowed (data-rich). This is our comparative advantage in the digital economy and we need to capitalise on it by adding value through analytics and applications, rather than merely exporting the raw material. This protectionist paradigm, though, may not benefit Indian companies, since MNCs operating here can also access the data.
Is data localisation good or bad for India? If other countries too “localise” data, what impact will it have on India’s $200 billion IT industry, which depends on free flow of data? Will it spur and boost India’s nascent AI companies? Should one look at a more nuanced approach that distinguishes what data must be stored only here, what may be exported but must be “mirrored” (stored) here, and what may be freely exported? These are more questions to ponder.
There are other issues that stem from the huge amount of data being generated and stored, and the emerging capabilities of AI to take data analysis to an altogether new level. Analytics, which uses data to model behaviour, is now being supplemented with AI to predict behaviour. From here, it is not a big step to influencing behaviour. Further, the fact that Aadhaar provides a means to easily link multiple data sets about an individual is a cause for concern, since this can effectively open up the entire life of the individual to whoever collates and processes this data. With this and more, the “surveillance State” is already a near reality; just as worrisome is the fact that not just the government, but corporates or other non-State players too could develop this capability.
Welcome, then, to the world of digital data — with all its goodies and its pitfalls.
Kiran Karnik is an independent policy and strategy analyst and a writer. His most recent book is eVolution: Decoding India’s Disruptive Tech Story (2018).