Databases and Information Management -
Perspectives and Problems
Corporate & Economic Research Centre, Bangalore - 560 095
Decision making, whether for business, government or an individual, is based on a central element: information. Information forms the basic input for analysing the current situation, evaluating the possible options and selecting the action to be taken. When the decision relates to business and the economy, much of that input is information or data flowing out of official institutions of the government. These institutions are often self-centred and are not aligned to the specific requirements of their various users. Much of this information and data is released in formats and templates that are outdated or ill-suited to ready reference. This lacuna is found not only in India but also in advanced countries, international agencies and development institutions.
It is at this juncture that a database organization could step in and make information available, suitably packaged and presented. The situation is not easily tackled, but it must be addressed as a `global challenge'. Leaders in the field need to focus attention on future needs and join in the effort to bring about the changes that are clearly needed.
Information is a recorded event or happening and tends to be qualitative, while statistics are more quantitative. Data means statistics that are maintained regularly, like the population figures recorded by the census machinery of each country. However, all these words are used interchangeably to mean the same thing. The discerning people are those who are charged with the zeal of making this information perfect.
Values of Information
There are values attached to each of the above classes. Information is now a broad-based term that includes news, statistics and data; it is a composite term for the relevant benchmarks for decision making. A country relies more on statistics, an academician on data and a newscaster on just news. Each of these has value at a different level. A government official relies on statistics recorded for a specific period of the past that bear on a future occurrence; for example, price statistics of retailed products are reviewed to judge inflation levels when making fiscal policy. To a businessman, a short-term record of both the internal and the environmental aspects of his business is essential for making decisions. To a journalist, events of the previous day are of supreme importance.
Similarly, the exactness of information can differ at each level, as can its utility value and its perfection. Policy makers look at broad indicators in making decisions on the allocation of governmental budgets and priorities. Some kinds of information, like derived incomes of individuals or opinions on governmental policy, are dealt with in such abstract terms that they need to be treated with a lot of leeway. These assumptions about the value of statistics or information underlie any system of statistics in any field. Since the information we get is inherited, we know it has value but find it difficult to dispose of if it is found unworthy.
With this system prevailing, we often create and suffer a serious shortage of usable knowledge and understanding. In this dilemma we call ourselves products of an `Information Age'. The situation is similar to standing thirsty underneath a waterfall, unable to catch the flow because we have no vessel to hold it.
Over the last decade, databases that seek to capture all frontiers of information have sprung up. In the past two years, especially in the underdeveloped world, this discussion on information management has become relevant. We must ultimately be able to tell good information from useless information, add value to raw information and reduce duplication in such efforts. While we set out to overhaul and tinker with the existing systems of information, there is an urgent need to seek faster and more economical dissemination of such data.
The two issues that need a review are: the information age and the changes in the world that seek a new statistical system.
The growth of technology, especially in electronics and telecommunications, has opened up the world and ushered in the second most important revolution since the industrial revolution. It is indisputable that people are exposed to more information than ever before. This information explosion can attract both favourable and critical views. The growth in information gathering, analysis and distribution suggests that there is a big demand. Nevertheless, there is a need to distinguish between good and bad quality in the information disseminated.
There is an urgent need to separate information from non-information. Information really facilitates decision making, while non-information remains a mass of numbers or a set of figures. Examples can be found in our daily lives. There is a need to know the real value of money, but information on money traded in the grey market is non-information. The essential point here is that information, packaged and presented for consumption, forms a decision support system. This has also become critical with the political changes and economic upturn in many dormant countries.
The New Statistical System
Statistics published by many governments are almost similar in form and coverage. They have been used to measure the progress of the country, or the lack of it. The information was presented as a time series on a consistent measure, and comparisons were made between two time periods. From these comparisons many businesses used to predict future prospects, which is a sound approach but needs a lot of disaggregation and further massaging.
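The two-period comparison described above can be illustrated with a minimal sketch. The function name `percent_change` and the index values are hypothetical, invented purely for illustration; they are not official figures or any agency's method.

```python
def percent_change(series, base_period, current_period):
    """Percentage change of a consistently measured indicator between two periods."""
    base = series[base_period]
    current = series[current_period]
    if base == 0:
        raise ValueError("base period value must be non-zero")
    # Multiply before dividing to keep the arithmetic exact for simple values.
    return (current - base) * 100.0 / base


# Hypothetical index values, invented for illustration only.
price_index = {1990: 100.0, 1991: 110.0, 1992: 121.0}

change = percent_change(price_index, 1990, 1992)
print(f"Change from 1990 to 1992: {change:.1f}%")
```

A single aggregate figure of this kind is where the need for disaggregation arises: the same calculation would have to be repeated by region, sector or product group before it becomes useful to an individual business.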
The second issue is that the analysis and dissemination of these statistics are constantly changing. For instance, a supercomputer was once needed to store, analyse and report any macro-level statistics. With technology progressing, the necessary storage is now available even on personal computers. While this revolution goes on daily, the need for information is becoming more customer-specific, and so the maze of data generated needs to be reviewed in this light.
There are broadly four types of information-generating organisations: national governmental agencies, international institutions, business firms and their associations and, finally, information industry firms including nonprofit agencies. Each of these institutions brings out a specific variety of information meant largely for its own use, except for the last-mentioned information packagers.
The major criticism about information provided by all these agencies can be clubbed together under two heads as follows :
i. Publishing data that are relevant to a large, heterogeneous body of users;
ii. Aggressive marketing to actual and potential clients.
The criticism applies largely to the governmental agencies, international institutions and business groups. The information published by them is largely for their own use and has no orientation towards its users. Secondly, many of them are either averse to giving out information or do not have the machinery required to handle their marketing.
This implies that the information marketed needs to become more user-friendly. There are other issues, such as the objection of many governments to the transnational exchange of information. For an information packaging and marketing company, it is important that these issues be resolved or that those governments be educated about such exports of information. The activities of many governmental agencies involved in generating statistics show that the emphasis is more on collection of information than on dissemination. Much of the budget allocated for information generation is used up by the massive field teams, leaving little for dissemination, so that most of the data collected is archived. For example, in India we estimate that nearly 85% of the raw census data gets stored and does not get published for want of funds. There is an urgent need for these agencies to come out of `officialdom' and move to a more entrepreneurial method of working.
Issues of Government Generated Information
Henry Kelly and Andrew Wykoff argued that governments need to be questioned on the customer focus of the information they provide. They also opined that the government machinery takes a long time to turn the figures collected into valuable information. Thirdly, the modernisation of dissemination techniques has made it mandatory for governments to change, though many resist the change. Finally, speedy dissemination could reduce the expenditure that many private agencies incur in estimating the statistics themselves; that expenditure falls if the government publishes information on time.
There is no doubt that major changes are needed and that, if implemented, they will result in better decision making at the governmental level and better appreciation of government policies by the private sector.
At this juncture one should not conclude that the government needs to disengage from the activity of information collection and dissemination. Firstly, governments are in the unique position of providing continuous and stable statistical services. Secondly, they thereby provide long-run perspectives of the economy. Thirdly, there is no profit orientation in their information collection and dissemination. Fourthly, there is no bias in reporting towards any sector or section of the population. Finally, no other data provider can be more user-friendly than the government, as it represents the people. This attitude is developing fast in India, as many Government agencies are now prepared to discuss their outputs and also show keen interest in collaborating with private agencies in processing, publishing or marketing activities. It is our experience that many Governmental agencies realise these aspects and are open to proposals for data massaging and dissemination. Any agency proposing to collaborate with the Government should respect the three main issues of concern, as follows :
- Coverage and reliability;
- Methodology ; and
- Integrity and objectivity.
Collaboration on Database Activity
The areas of collaboration between the Government and private database and information packagers need to be looked at in the current mood of liberalisation, which is bringing a sense of transparency into many hidden areas. The areas of general interest or consensus would be those where information helps decision making. It would also mean that the Government should realise that the private sector is the prime mover of the economy and that it is therefore relevant for the latter to play a role.
The areas of importance for the private sector would be market information, finance, trade and prices. Market information covers data on the flow of goods from producer/trader to buyer/consumer in terms of volume, price and quality. Finance information relates to a maze of issues covering banks, taxes, incentives, plan outlays, trade balances and so on. Trade information would mean information relating to imports, exports, government purchases and deemed exports. It could also mean information related to wholesale trade, government purchases, retail trade, regulated markets and so on.
The issue of private sector involvement in the collection, collation and dissemination of data needs to be discussed in three separate compartments. The reason is that various statutes and rules deal with the collection of data and information. A lot of official secrecy is supposed to be maintained at the enumeration phase and when the data is collated. The other issue is the respondent's fear of disclosure if the information is given to a private institution. Currently, respondents are protected by laws and statutes guarding their individual secrecy. There would also be resistance from the statistical organisations themselves towards any privatisation effort.
The discussion is in three parts: collection, collation and dissemination.
Collection of Data : An Issue Dogged by Statutes
The efforts of the Government in this area are tremendous, but are affected by bureaucratic delays and red tape. The census operation is an example: data collection starts nearly a couple of years ahead of the enumeration, with the involvement of all teachers in the country. This leaves serious doubt about the credibility of the collection mechanism and its output, as the field force has neither interviewing skills nor a focus on the end product.
The laws relating to the census, taxation and other statutory disclosures, such as those on imported goods, are a few examples of how this collaboration can be marred by legislation. The census activity is large: the machinery required is geared up nearly ten months in advance, and all schedules are then administered by partly trained personnel. This alone contributes to the perception that the output is defective. The laws binding the process rest on the principle that every individual must give information truthfully and that the respondent's identity therefore needs to be protected. The question is: if the process of enumeration is privatised, does the information get the same secrecy as it would under the wing of the government? Another survey similarly affected is the Annual Survey of Industries. To an extent, the respondent fills in the forms by rule of thumb. If the survey were privatised, would the respondent be willing to part with facts despite the fear of their misuse? Collection of information under statutes like the Factories Act, Mines Act and Payment of Wages Act is another area where private data collection may not be feasible. However, other areas, like the Household Survey (NCAER), the Price Survey and a host of other surveys launched by the government, need to be moved into the joint sector. An attempt should be made to move out of the present impasse, in which there is a large demand for information but it is bottled up in secrecy because of legislation.
The Central Statistical Organisation coordinates with the state directorates of economics and statistics, and so becomes more bureaucratic and gets bogged down by procedures and process. The same is the case with all other institutions involved in the collection of data and information. The efficiency and effectiveness of the statistical bureaus in the states were addressed at the First Conference of State Ministers of Statistics in April 1981. This body discussed the improvements the system needed, yet no visible change is seen in the enumeration process or in the users' perception of these data. The organisations of the government machinery also interact with a host of international institutions, making it more urgent to rectify the lacunae in the statistical and field operations of the various bodies under the government.
The Indian Statistical Service, set up in 1961, should have brought some professionalism to these operations, but it has also gone the way of other red-tape organisations. It would therefore be difficult to expect any discussion on privatisation to be taken rationally at any of these institutions, for they fear the loss of jobs or of their current level of comfort.
Collation of Data : An Issue of Neither Willingness Nor Ability
The next stage of work after collection of data is collation, sifting and layout. In this area the existing machinery is so outdated that it takes nearly a decade to finalise the census data in full, by which time another enumeration exercise has to start. The absence of OMR techniques, online entry, decentralised EDP and machinery to handle such a large volume of data makes it urgent to think radically about developing synergies with the private sector. While some attempt has been made at this, the purchase procedure of tender, lowest quote and registered dealer makes the process more cumbersome. If an agency could address all these concerns, there are numerous opportunities for collaboration between governmental agencies and private information management firms.
Governmental agencies in the business of collating and disseminating statistics and information have some in-built problems, which can largely be reduced by partnership with private agencies. Firstly, there is no cost-benefit analysis of information collection and publishing. Secondly, there are inordinate delays, which reduce invaluable data to mere figures. Thirdly, many governments are manned by politicians who have their own vested interests and could bias the information generated and disseminated.
While we could argue the issue endlessly, a look into the future is important in order to overcome the criticisms and develop a dynamic information network.
In the United States, an example of State-private collaboration can be seen in the SEC and a contractor, Disclosure Inc., getting together to publicise all the returns filed by businesses in the country. It resulted in a great technological leap, with the information disseminated through CD-ROM and online.
Opportunities do therefore exist to burst the bubble of officialdom and move into a free world of authentic and timely information through collaboration. It is important to note that the private individual has to be shown an incentive to come out with exact information. The private individual cannot be compelled to give correct responses to questionnaires unless they feel that providing such information benefits them. A partnership between the government and private agencies in collecting and disseminating information would encourage this.
The areas of cooperation may be the development of software, machines and support for EDP, manpower for job work, and making more value-added reports and data available. These areas need to be developed judiciously, as there are unscrupulous elements who may misuse the data for commercial purposes once the collation is completed. This area is definitely attractive and feasible to privatise.
Dissemination of Data
The need for information is so urgent that many government-sponsored publications fail to cater to it for one simple reason: delay. Currency is one of the basic attributes of any information that is packaged. The agencies in charge of dissemination are pressed by problems of budget, purchase procedure and secrecy, and by the absence of any customer presence, along with lethargy and official disinterest in what is not important to them. As a result, a lot of information that is commercially a cash cow gets swept under the bureaucratic carpet.
The problems of data dissemination do not end there. The bureaucracy is at times shaken by revelations of government views and of vibrations at decision-making levels where political bosses are in command. Documents like mid-term plans, five-year plans, the economic survey, the public enterprises survey and labour statistics are always considered `closed door' operations, although they are badly out of date or irrelevant by the time they are published.
Constrained by this factor, we often find a serious lack of proper dissemination norms even at international levels. The global statistical bodies find it difficult to project India's actual performance. We suffer because the positive facts about our progress remain unreflected, and the projections made by these agencies highlight the agonies of passing through the government maze.
It goes without saying that India was one of the earliest to develop statistical databases that are considered among the best in the world. But for lack of dissemination machinery, the system often gets caught in a state of `to be or not to be', which results in delay and atrophied information. It is often found that in India administrators manage data while real analysts, like economists and statisticians, find themselves at the receiving end.
During the last four decades India has built up sufficient expertise in many areas of official statistics. The experience is shared with other developing nations by providing expertise on bilateral and multilateral issues under the various technical assistance programs. India can assist other nations in conducting the following :
(i) Socio-economic survey
(ii) Population census
(iii) Agricultural census
(iv) National accounts
(v) Economic census and service
(vi) Statistics for integrated rural development projects and
(vii) EDP areas.
However, dissemination is bogged down by officialdom. The whole gamut of information can hardly be handled by the administrative machinery, which manages almost 75% to 80% of the total as `Primary' data. The remaining 20% to 25% is `Secondary' data sourced from corporate research bodies and private agencies engaged in their own value analysis. Without dissemination, collection and collation serve no purpose. The commercialisation of information therefore needs a professional approach, which is possible only through a marriage of sorts between the government and private institutions.
The current environment in the country has seen the media becoming conscious about information, including television, which has started projecting economic, technology and corporate news. This medium needs to be properly tapped, which the State-owned TV network could not do. After the Telecom Policy, satellites and networking have become the order of the day, reducing the barriers of space and time for commercial institutions and households owning computers. Linking them up through modems, either for fax or for mail, is becoming popular. All business houses seek online information from databases, so the dissemination media are available. The linkage between collection, collation and dissemination needs to be made, which obviously can be done by the private sector.
Therefore, the time has come to clear the cobwebs covering this mammoth, sleeping machinery and to utilize its golden resources for the public good. The only way is through continued collaboration between the government and the private sector, and the time is now ripe.