Workshop on Scientometric Database, 26-27 April 2002


The Workshop, organised by the Institute of Informatics & Communication, University of Delhi, and the National Information System for Science and Technology (NISSAT), was attended by distinguished scientometricians and IT and LIS professionals from across the country.

Dr Ashok Jain welcomed the participants to two days of brainstorming sessions. Dr A Lahiri informed the participants that NISSAT would shortly appear in a new incarnation, leaving behind 25 years of fruitful existence. National Mapping of Science has been a regular activity of NISSAT for quite some time. The reports generated so far are rudimentary in nature, and the interpretation of the tables is not of a very high order. The tables do not focus on the real issues, and intuitive interpretation of the databases is lacking. What is being done is acceptable, but incisive analysis is what is required. Dr Lahiri mentioned the closure of the SCI data analysis project midway; attempts to revive the project have not yet yielded results. He also urged scientometricians to train successors, as most of the present batch of scientometricians are on the verge of retirement. They should be open-minded about passing on their expertise to the next generation of bibliometricians.

Dr B K Sen, Dr Ashok Jain and Mr B G Sunder Singh chaired the sessions and Dr B K Sen acted as the Rapporteur General.


It was not clear to the participants what was to be presented. Ms Kamini Mishra pointed out that the reports were not to be presented as such; rather, the problems faced in the downloading, standardisation, analysis, etc. of data should be highlighted.

Mr S Arunachalam of the M S Swaminathan Research Foundation took the floor and explained step by step how the data from CAB Abstracts were downloaded. Searching the database with the term India does not capture all the data, as many addresses do not include the country name. To overcome this difficulty, the names of all the places where agricultural research is carried out were first collected, and searches were then conducted with each of these place names. After downloading, the data were checked manually, as some Indian place names such as Delhi, Kochi, and Salem are also found abroad. The next stage was the standardisation of data. Titles of periodicals were rendered in full, as different databases render them differently. Synonyms, variations in place names such as Calcutta and Kolkata or Delhi and New Delhi, and mistyped place names create many problems and must be taken care of. Mr Arunachalam opined that the address field is the most difficult one. The publication year is another problem: the papers belonging to a particular journal year appear in the database over several years. Hence, to find the data of a particular publication year, the database has to be searched over several years. At this point it was decided that for the analysis of data only the disc year of the database is to be considered.
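The place-name search, manual check, and standardisation steps described above can be sketched roughly as follows. This is only an illustration in Python, not the tooling actually used in the project; the variant table, the set of ambiguous names, and the function names are all hypothetical, and a real table would be far larger.

```python
import re

# Hypothetical variant table: spellings seen in address fields mapped to
# one canonical place name.
PLACE_VARIANTS = {
    "calcutta": "Kolkata",
    "kolkata": "Kolkata",
    "new delhi": "Delhi",
    "delhi": "Delhi",
    "kochi": "Kochi",
    "salem": "Salem",
}

# Place names that also occur outside India, so a hit needs a manual
# check unless the address names the country explicitly.
AMBIGUOUS = {"Delhi", "Kochi", "Salem"}


def has_word(text, word):
    """Whole-word, case-insensitive match (so 'Jerusalem' is not 'Salem')."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def find_place(address):
    """Return the canonical place name found in an address, or None."""
    # Try longer variants first so "new delhi" is matched before "delhi".
    for variant in sorted(PLACE_VARIANTS, key=len, reverse=True):
        if has_word(address, variant):
            return PLACE_VARIANTS[variant]
    return None


def classify(address):
    """Label an address 'indian', 'check' (manual check) or 'unknown'."""
    if has_word(address, "India"):
        return "indian"
    place = find_place(address)
    if place is None:
        return "unknown"
    return "check" if place in AMBIGUOUS else "indian"
```

Whole-word matching is a deliberate choice here: a plain substring test would wrongly treat "Indiana" as "India" and "Jerusalem" as "Salem".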

It was urged that a comprehensive list of science cities in India which generate publications may be prepared. Ways and means need to be worked out involving the editors of scientific periodicals to ensure complete address in the publications along with country name and PIN code.

The limitation of most of the databases is that they give the address of the first author only. As a result, multinational papers in which the Indian authors do not occupy the first position cannot be located. Indian publications have been taken to be those publications that have at least one Indian address. Another limitation of this database is that it cannot be used for collaboration studies. Only two databases in the world, viz. SCI and MathSci, provide the addresses of all the authors.
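The "at least one Indian address" rule, and the effect of a database that records only the first author's address, can be illustrated with a short Python sketch. The function name and the sample addresses are made up for illustration; the country test is a simple whole-word match and is an assumption, not the method used by the teams.

```python
import re


def is_indian_publication(addresses):
    """True if at least one of the given author addresses names India."""
    return any(
        re.search(r"\bindia\b", address, re.IGNORECASE)
        for address in addresses
    )


# Made-up paper in which the Indian author is in second position.
all_addresses = ["Univ X, Japan", "Inst Y, Bangalore, India"]

# A database giving all addresses (as SCI and MathSci do) finds the paper.
found_with_all = is_indian_publication(all_addresses)

# A database giving only the first author's address misses it.
found_with_first_only = is_indian_publication(all_addresses[:1])
```

With the sample above, `found_with_all` is true and `found_with_first_only` is false, which is exactly the loss described in the paragraph.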

The data elements used for the study are: author, address, source, document type, and class codes. Keywords were not downloaded. The class codes as given in the database were taken into consideration for subject analysis.

Generating tables with programs creates problems because of variation in the rendering of journal names and author names and the absence of country names. Here manual intervention is absolutely essential. To avoid these problems, journal names and author names are to be rendered uniformly and country names added manually. For the analysis, journal country and impact factor (IF) were also added manually.

The problems that were faced while using MathSci and BIOSIS databases are more or less the same as those described above.

Dr N C Jain of the Indian Council of Medical Research, New Delhi described his experience of mapping medical literature using the Medline database. He said that a program for the conversion of text files into dBase files is needed. A master list of 250 Indian cities which generate medical publications has been prepared. He opined that the impact factor of the disc year should be used as far as possible.

Mr S M Dhawan of the National Physical Laboratory, New Delhi spoke on the mapping of physics literature using the INSPEC database. He had no problems in extracting the required records with the country name India, as India figured in the addresses of all the Indian publications. Programs have been written to automate the analysis of the data from different angles. To avoid the problem of standardising journal names, the ISSN has been used. He felt that code names are necessary for identifying each and every institution. At this point Mrs Pandalai and Mrs Karanjai, both from INSDOC, informed the participants that the work is going on and that the code names for institutions will be finalised before long. Continuing, Mr Dhawan mentioned that, for the analysis, the type of organisation and the country of publication were added manually. For generating a table of high quality publications, only those papers have been considered which are published in journals having IF > 1.5. Tables in the form of a matrix will also be generated to link the organisations with the areas of their work. The project is expected to be completed in a few months.
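The two table types mentioned above, a list of papers in journals with IF > 1.5 and an organisation-by-area matrix, could be produced along the following lines. This is a sketch under stated assumptions rather than the programs written for the project: the record layout (dictionaries with `org`, `area`, and `issn` keys), the function names, and the ISSN-keyed impact factor table are all hypothetical.

```python
from collections import Counter

IF_THRESHOLD = 1.5  # threshold reported for "high quality" publications


def high_quality(papers, impact_factors):
    """Keep papers whose journal, keyed by ISSN, has IF above the threshold.

    Journals missing from the impact factor table are treated as IF 0.0,
    which is an assumption of this sketch.
    """
    return [
        paper for paper in papers
        if impact_factors.get(paper["issn"], 0.0) > IF_THRESHOLD
    ]


def org_area_matrix(papers):
    """Count papers for each (organisation, subject area) pair."""
    return Counter((paper["org"], paper["area"]) for paper in papers)
```

Keying journals by ISSN rather than by title follows the point made in the talk: the ISSN sidesteps the problem of differently rendered journal names.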

As to the utility of these reports, Dr Lahiri asked the audience to give their views. He himself cited a few instances. In AIIMS, a report was used as a tool to advocate for higher salaries. Delhi University was happy to know that it stands at the top in respect of scientific research. Mr Arunachalam pointed out that, to increase the visibility of these reports, he is publishing papers based on these data in journals like Current Science and Scientometrics.

Dr I K Ravichandra Rao of the Documentation Research and Training Centre, Bangalore narrated his experience of mapping engineering literature using the INSPEC database. He also faced a downloading problem, as India did not figure in all the addresses of Indian papers. Apart from India, searches were conducted with the names of big cities. Abstracts were also searched with the terms India and Indian. This too resulted in locating and downloading several contributions from India. Duplicates were eliminated after downloading. For locating the data of 1990 as well as 1994, CDs of several years were examined. Dr Rao opined that the data are not good enough for this type of analysis.
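Because the same record can be retrieved by more than one search (country name, city name, abstract term), the downloaded sets have to be merged and de-duplicated. A minimal Python sketch of that step is given below; the choice of title, source, and year as the duplicate key, and the record layout, are assumptions made for illustration only.

```python
def record_key(record):
    """Build a comparison key: case and spacing differences are ignored."""
    title = " ".join(record["title"].lower().split())
    source = " ".join(record["source"].lower().split())
    return (title, source, record["year"])


def dedupe(records):
    """Return records with duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

In use, the result sets of the separate searches would simply be concatenated and passed to `dedupe`.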

Dr D B Ramesh of the Regional Research Laboratory, Bhubaneswar highlighted the experience gathered by the team during the analysis of Indian earth science literature based on the GeoRef database.

Mrs T A Pandalai in her presentation spoke about the mapping of Indian science using Indian Science Abstracts (ISA). Generally ISA covers about 25,000 to 30,000 publications per year. The data elements being included in the database are: author(s), affiliation of the first author, journal title, year, volume, page, keywords, class numbers according to UDC, and the type of institution. The work is continuing and is expected to be complete in a few months' time.

Dr K G Tyagi, Director, National Social Science Documentation Centre, New Delhi dwelt on the mapping of social science research in India. For this purpose the Social Science Citation Index (SSCI) for the years 1998 and 1999 has been covered. SSCI covers about 1800 journals from all over the world, of which only 7 are from India. In the SSCI of 1999, there are about 2500 items from India. JCR/SSCI is not available in India; it has been ordered and is expected soon. The major problem being faced is incomplete data. About 60% of the data are still incomplete, and missing elements are gradually being added. Dr Tyagi pointed out that the coverage of Indian literature by SSCI is much lower than that of the International Bibliography of Social Sciences, which covers about four times as much Indian literature. It is heartening to note that Indian contributions are found to be the highest amongst the developing countries. Dr Tyagi argued that a new method of computing the impact factor needs to be found for the social sciences, as the useful life of social science literature is much longer than that of scientific literature.

Dr Sanjeev Singh dealt mainly with the patent database (Beta version) being created at the Institute of Informatics and Communication, New Delhi and also touched on patentometrics. Apart from patent information, the database also includes trademark data as well as copyright data. In its first phase the database has covered basic details of all Indian patents (N = 47,491) granted during 1974-2000; these details are available on the Web site. The database includes the IPC number, the name of the inventor, keywords, the years of filing and granting, etc. The data are being culled from the Gazette of India, Pt 3, Section 2. The database does not include the year of expiry, summary, full text, etc.

Dr Satyanarayana suggested that the summary as well as the full text of the patent should be included in the database. Dr Sen wanted the date of expiry to be included, as this information is vital for parties intending to exploit the patent. Dr Rao wanted information on the exploitation of the patent, the royalty paid to the inventor, and the time elapsed between the granting and exploitation of a patent to be included.

The participants felt that a meeting of representatives from patent centres, persons interested in patents, patent attorneys, etc. should be organised to expose them to this patent database.

Identification of the Minimum Set of Common Data Elements

Mapping of Indian research in science and social sciences is being done by different agencies in India. For the sake of uniformity it is necessary to identify the minimum set of common data elements. The exercise was undertaken at the post-lunch session on the first day. Dr. Ashok Jain acted as the moderator. The exercise helped to agree on the following points. In some cases, mandatory and optional elements could also be decided.

  1. Document type:

Books, research articles, review articles, letters, short communications, conference publications, theses, reports, patents, and any other types [standards, articles in a collected work, book chapters] are to be covered. Source tags will be added with each type.

  2. Author

Author names will be rendered as they are (i.e. as given in the database). All the authors of the paper will have to be included. An author authority list needs to be generated.

  3. Author affiliation

  4. Institution type

  5. Journal title

  6. Language: Language of the paper is to be considered

  7. Treatment: As given in the database

  8. Publication year: As given in the database

  9. CD date: As given on CD.

  10. Volume number, issue number, year, pagination for journal articles may be rendered as per Unesco style

Scientometric Report Formats

Following report formats were decided:

National Scientometric Database Issues

The discussion on the National Scientometric Database Issues began with a lecture by Mr Naveen Singhal. He provided an excellent overview of databases, covering among other topics: database architecture and usage methodology, DBMS components, DBMS software, storage of data, retrieval and modification of data, triggers and stored procedures, application software, standard DBMS, popular alternatives to a DBMS, whether to use a DBMS or an alternative, preparing a database, who is involved, database design criteria, database design components, development of application software, accessibility of the database, and administration and maintenance.

As data from different databases are going to be merged to generate a single scientometric database, Mr Arunachalam opined that there is a need to standardise data elements to ensure uniformity. There should be a standard format, and fields should be identified for data entry. The database should be open, allowing any interested person to access data for various types of analysis. Dr Rao suggested that the author field should be repeatable, as multiple authorship is the order of the day. Dr Satyanarayana cautioned that downloading data from different databases might give rise to copyright problems. Dr Lahiri assured the audience that the copyright problem would be duly taken care of.
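One way to picture a standard record format with a repeatable author field is the Python sketch below. It simply restates the common data elements agreed at the workshop as a record type; the class and field names are hypothetical, and the `source_database` field is an added assumption (useful when records from several databases are merged), not something decided at the workshop.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Author:
    name: str              # rendered as given in the source database
    affiliation: str = ""  # author affiliation, where the source gives one


@dataclass
class Record:
    document_type: str     # e.g. research article, review article, patent
    journal_title: str
    publication_year: int  # as given in the database
    cd_date: str           # as given on the CD
    # Repeatable author field: one entry per author of the paper.
    authors: List[Author] = field(default_factory=list)
    institution_type: str = ""
    language: str = ""
    treatment: str = ""
    source_database: str = ""  # assumption: tracks where a record came from


# Example with made-up values.
record = Record(
    document_type="research article",
    journal_title="Journal of Examples",
    publication_year=2000,
    cd_date="2000",
    authors=[Author("Author, A", "Inst X"), Author("Author, B", "Inst Y")],
)
```

Holding authors as a list, rather than as a single text field, is what makes collaboration studies possible on the merged data.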

Identification of the Data Sources

Dr Ashok Jain asked the participants to name the data sources they have used. Mr Arunachalam pointed out that he had used CAB Abstracts. However, abstracting services like AGRIS, AGRICOLA, and ASFA may also be covered to capture Indian contributions comprehensively. This gave rise to the question of mandatory and optional databases. Mandatory databases must be covered. Optional databases should be covered depending upon feasibility and availability. The databases being covered now for the mapping are: SCI, SSCI, INSPEC, Chemical Abstracts, GeoRef, BIOSIS, Medline, CAB Abstracts, and Tropical Diseases Bulletin.


1. Publications containing at least one Indian address should be considered as Indian publications.

2. Ways and means need to be worked out involving the editors of scientific periodicals to ensure printing of complete address in the publications along with country name and PIN code.

3. As data of a particular publication year gets scattered over several years in a database, it was decided to consider only the disc year for the downloading of data. For example, while analysing SCI database for the year 2000, all Indian publications captured in the database during the year 2000 are to be downloaded. The Indian publications so downloaded will obviously belong to the year 2000 and before.

4. For the purpose of impact factor-based analysis, the impact factor of the disc year should be used. In case of non-availability, the impact factor of the previous year should be used.

5. A master list of Indian science cities should be generated taking care of synonyms and variant spellings to ease retrieval of data from various databases.

6. Author names are rendered differently in different databases. To bring in uniformity, an author authority list needs to be generated.

7. A meeting of the representatives from patent centres, persons interested in patents, patent attorneys, etc. should be organised to expose them to the patent database being developed by the Institute of Informatics and Communication.

8. In the patent database, world data on patents should be given alongside the Indian data.

9. The database should have the facility to generate graphs, tables, etc. automatically.

10. The data elements frozen at this workshop should be the data elements of the merged database.

11. Any future exercise on national mapping should take into account the deliberations of this workshop.

12. Attempt should be made to achieve synergy among the subject specialists, scientometricians, statisticians, experts in operations research and mathematical modeling, to derive correct interpretations from the scientometric data. There is also a need to relate input and output of Indian research with expenditure. Data mining techniques should be used to generate good and useful knowledge.

13. There is a need for (i) developing science indicators and relating them to socio-economic indicators, (ii) technology development measures, (iii) data on S & T manpower and expenditure, (iv) models for the growth of knowledge, (v) identifying gaps in research publication data, and (vi) measuring a lab's contribution to the nation.

14. Subject experts should be invited to decide the sub-fields in their areas on which national mapping studies need to be taken up in the next two years. Scientometric studies in atmospheric sciences, veterinary sciences, biotechnology and other areas may also be taken up, provided subject experts express the need. Institutions and experts should also be identified to do these studies. The PAC on scientometrics will decide which new areas need to be taken up for study and how to continue the earlier studies.

15. Significant Indian databases should be used for all national mapping studies.

16. To ensure comprehensive data capture LISA, Environmental Abstracts, Pollution Abstracts, EMBASE, PASCAL databases may also be used apart from the ones already mentioned.

17. Scientometric training programs to impart skills to new entrants in this field should be organized regularly.

18. Scientometric workshops for all scientists embracing scientists from labs, library and information professionals, science administrators, IT experts, scientometricians, statisticians, system analysts, operations researchers, mathematical modelers, policy makers, decision makers, educationists, etc. should be organized at regular intervals.

19. The existing data should be merged and standardised. Then the experts should be called to study the data. Two years of regular study should be commissioned to generate data.
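Recommendations 3 and 4 above amount to two simple selection rules, which can be sketched in Python as follows. The record layout, the `(ISSN, year)` keyed impact factor table, and the function names are hypothetical; this is an illustration of the rules, not an implementation used by any of the teams.

```python
def records_for_disc_year(records, disc_year):
    """Recommendation 3: take every record captured in the given disc year,
    whatever its publication year (that year or earlier)."""
    return [r for r in records if r["disc_year"] == disc_year]


def impact_factor_for(issn, disc_year, if_table):
    """Recommendation 4: use the impact factor of the disc year; if it is
    not available, fall back to the previous year. Returns None if neither
    year is in the table."""
    for year in (disc_year, disc_year - 1):
        value = if_table.get((issn, year))
        if value is not None:
            return value
    return None
```

The fallback is limited to one year back, as in the recommendation; what to do when neither year is available is left open here and signalled by `None`.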


Mr Punya Palit dwelt on Microsoft SQL Server 2000 and Mr K V Bhaskar Badrinath elaborated on the features of Oracle 9i. The organizers provided the participants with a CD-ROM of the cleaned-up scientometric databases derived from source databases like MATHSCI, CABI, BIOSIS, MEDLINE, INSPEC, GEOREF, etc. The structure and content of each database were also shown. It was felt that the actual merging of these data into one scientometric database, though not impossible, would entail time and manual effort. The Workshop ended with a vote of thanks to the chair.

4th Annual National Convention of MANLIBNET on Paradigm of Information Technology Application to Business and Management Libraries, 2-5 April 2002

The Convention, held at the National Institute of Financial Management (NIFM), Faridabad, was attended by about 50 participants from all over the country. Mr P Jayarajan, Head, Library & Information Services, The British Council, India inaugurated the Convention. Dr T A V Murthy, Director, INFLIBNET and Prof C R Karisidappa, Head, Department of Library and Information Science, Dharwad University, Karnataka delivered the theme address and keynote address respectively. Dr S D Khan, Librarian, NIFM and the Organizing Secretary of the Convention, introduced the dignitaries on the dais and gave a brief introduction to the Convention and MANLIBNET.

Prof A N Saxena in his welcome address opined that the theme of the Convention is very useful and timely. Such events, he maintained, would help not only the librarian but also the faculty. He expressed his happiness that the Convention was being organized at NIFM, Faridabad.

Mr Ashok Jambhekar, President, MANLIBNET recalled the thinking that went into the formation of MANLIBNET. He stressed the need for such networks, more so for management libraries that are facing various problems such as budget constraints, poor infrastructure, and improper pay scales of library staff. He hoped that, with the cooperation of all, MANLIBNET would achieve its set goals.

Dr C R Karisidappa, in his keynote address entitled "Application of Information Technology in Business and Management Libraries: Challenges and Opportunities", highlighted the various issues concerning business and management libraries.

Dr T A V Murthy in his theme address discussed the paradigm of information technology. He informed the participants that the Government of India has allocated Rs. 110 crore in the 10th Plan particularly for the modernization of libraries. He stressed the need for digitization of Indian management libraries. Dr Murthy concluded his address by highlighting the importance of e-factors such as e-commerce and e-learning.

In his inaugural address Mr Jayarajan maintained that we should have some parameters to evaluate a library. The parameters for categorization may be based on such factors as: 1) library resources; 2) infrastructural support; 3) interaction with the students and faculty; and 4) opening hours. He opined that benchmarking is very important in the case of a library. Some libraries may be benchmarked to design and match the requirements of the users. The ultimate aim should be total user satisfaction.

The inaugural session ended with a formal vote of thanks proposed by Dr S D Khan.

Technical Sessions

In all, sixteen papers were presented in seven technical sessions chaired by Prof R L Raina, Mr Akhtar Parvez, Mr M M L Goyal, Dr S S Shirunath, Mr Partha Bhattacharya and Dr P R Goswami. The papers covered such diverse areas as knowledge management, online information retrieval systems, digital libraries, information services, information marketing and information technology.


Vendor Presentation

  • A presentation of LIBMAN library software was made by R. S. Enterprises.

  • Verghese Electronic Publishing also made a presentation of their products.

Panel Discussion on Challenges of Digital Libraries

Dr A Lahiri, Mrs Kalpana Dasgupta, Mr Ishwar Bhat, and Prof Ashok Hernal took part in the panel discussion and expressed their views.

The Convention came to an end with a formal vote of thanks proposed by Dr S D Khan.

-- Reported by Ramesh C. Gaur
Chief Librarian and Business Manager
Institute of Management Technology, Hapur Road, Rajnagar, Ghaziabad - 201001 (U.P.)

20th Training Programme on Application of Computers to Library and Information Services, 10 September - 6 October 2001

The INFLIBNET Centre of the University Grants Commission (UGC) successfully organized the 4-week training programme at Ahmedabad. The programme was specifically meant for nineteen recently funded universities that received Rs. 6.5 lakh each from UGC to modernise and automate their libraries.

The course covered all aspects of information technology related to library automation, such as computer fundamentals, operating systems, databases, networking, the library management software SOUL, Internet, CD-ROM technology, barcodes, library standards, multimedia, e-learning, and search engines. During the course, theoretical and practical aspects were evenly balanced, with emphasis on hands-on practice.

Participants who have undergone the training programme will start database creation using the software SOUL. INFLIBNET will make these databases available online as soon as they are ready through its Web page at

-- Reported by Prem Chand, INFLIBNET Centre, Ahmedabad.

XX National Seminar of IASLIC

The Seminar is going to be hosted by the Punjabi University, Patiala during 27-30 December 2002. The theme of the seminar is Digital Information Systems and Services. Papers are invited in the following areas for deliberation:

  1. E-Libraries and Virtual Libraries

  2. Content Development

  3. Networks and Networking

  4. Internet, Intranet and Extranet

  5. Converging Technologies

  6. Multimedia

  7. Digital Library Applications

  8. Digitization in Indian Context

Special Interest Group (SIG) Meetings

    SIG Computer Application
    SIG Library & Information Science Education
    SIG Humanities Information
    SIG Social Sciences Information
    SIG Informetrics
    SIG Industrial Information

Coordinator (HQ) : Sri Salil Chandra Khan

Correspondence regarding papers should be addressed to the Convener, Editorial Committee, XX National Seminar, IASLIC, P291 Scheme No. 6M, Kankurgachi, Kolkata - 700 054.

-- Dipak Kr Nag, Hony. General Secretary,
23 March 2002

