Scientific work and the usage of digital scientific information - some notes on structures, discrepancies, tendencies, and strategies

The article discusses changes in scientific work (academic and applied) associated with the new potentials, but also the coercions, of information technologies. The background for this interest is the experience gained in several digital library projects that the inclination and willingness to use these technical possibilities are much less common than the developers of these systems, and all of us, tended to think in recent years. This seems to be true even in those scientific disciplines which were and are at the forefront of the development, e.g. physics and mathematics. The reasons for this observation are discussed by looking at general economic and social changes, the environments of work in the scientific sphere, the contents of scientific IT systems and the quantity and quality of their supply, the user side in its communities of practice, and the technological and organizational basis of scientific information. Some strategic issues for improving the situation are discussed in the final part of the paper.

communication media spread in the last thirty years. It was on this economic and social basis that IC technologies could enter their expansive and revolutionizing career, which led theorists to coin the phrase "information society" (cf. Lyon 1988; Schmiede 1996b). Furthermore, information has become reflexive: Processing information creates new information. Information is a formalized abstraction of reality. In this world of abstract information, a second reality as it were, one can combine pieces of information, process them, model information-led systems, and simulate their working in reality. Then the desired result is transferred back into the (first) reality and given a real material form: Information changes and shapes reality. (Probably one of the most impressive examples for understanding these processes in two worlds of reality is the virtual construction of a car by processing information sets delivered by engineers of the assembling firm and of many suppliers and subsidiaries, up to the simulation of certain properties of the future car in the computer.) Innovation is generated by processing information, and it is used in a cumulative feedback loop to generate new innovation. In other words: The technical form of knowledge, its information form, is the step from conventional technification and automatization to informatization (Schmiede 1996a; Spinner 1998, p. 75). This is the economic and social background against which scientific work and its use of digital scientific information have become, and are still in the process of becoming, a crucial resource of economic growth and social dynamics. The usage of digitized scientific information is in no way confined to the academic sphere: The Central Statistical Office estimates that in Germany about 70% of national expenditure on research and development is spent in the private sector of the economy, and only the remaining 30% in universities and research institutions outside the universities. So, in our discussion below of some structural features, problems and perspectives in the usage of digital scientific information, it has to be kept in mind that we are talking about academic tendencies as well as structural changes in industry and administration.

Scientific work and digital libraries
As with the internet in general, for many years it was physics and some parts of mathematics that took the lead in building and using the largest digital scientific database, the renowned Ginsparg or Los Alamos server (known for some years now as the "ArXiv" database). With the American Digital Library Initiatives and parallel activities in many European nations since the mid-nineties, a new phase of dissemination, popularization and technological progress in DL development took off; it led to a multitude of new digital libraries, to many scientific disciplines joining the process, and to new kinds of information and objects being included.
In Germany a combined initiative developed to get the different scientific disciplines and learned societies to cooperate on the one hand, and to include the commercial database providers and publishers on the other. The so-called IuK-Initiative of Learned Societies was founded in 1995 by the societies for information science, physics, mathematics, and chemistry, and in the years that followed it attracted not only the traditionally technology-oriented disciplines, but also sociology, pedagogics, biology, sport science and others. Web-based information networks in mathematics, physics, and later in sociology, as well as special digital information services in other areas, gave a considerable impetus to the dissemination of the usage of scientific digital information not only in universities, but also in industry. In the Global Info program from 1997 to 2000 the interdisciplinary cooperation and the collaboration with the commercial suppliers were consciously advanced; a large number of joint projects emerged, some of them working until 2002, and German activities opened up much more than before to international developments (cf. Schmiede 1999).
Since then, however, the dynamic momentum of these initiatives has to a considerable extent disappeared. Not that DL activities have generally come to a halt: There are numerous digital library projects nationally and internationally; the scope of research and development activities has in fact been enlarged, including in recent years new areas like museums, films, and archives, and extending technological development to questions like long-term preservation, integrated desktop services and, most recently, the design of new open architectures on the basis of web services technologies (cf. Payette/Staples 2002; Stoll et al. 2004). The Open Archives Initiative has substantially enlarged and improved availability of and access to digital resources in various areas. And the provision of digital content today belongs to the standard tasks of most scientific libraries, with a number of innovative activities.
In contrast, the IuK initiative of learned societies mentioned above is in a bad state. The web-based information networks in physics, mathematics and sociology advance slowly, but they have not developed into a central communication and cooperation medium in their respective disciplines. In the projects led by the German Research Association (DFG) to create virtual subject libraries in various disciplines, and in the projects led by the Federal Education and Research Ministry (bmb+f) heading towards a national digital scientific library (vascoda) with interdisciplinary subbranches in medicine, economics, technology and social sciences, DL development is re-concentrated with the traditional scientific database information providers and a number of leading libraries in Germany. And the cooperation with the commercial publishing world once envisaged in Global Info did not become stable but was confined to the projects in the course of this program; it has since dwindled to close to zero.
In sum, at least looking at the situation in Germany, digital library activities (used here as a synonym for the systematic usage of digital scientific information in the work of scientists) have so far not succeeded in overcoming their fringe status in sciences and humanities. Although, to a certain extent, using the internet and its resources has become part of the everyday work of many people doing scientific work, the vision of the DL movement, condensed in the general Global Info aim of providing "world-wide information at the individual scientist's desktop", is far from having become reality.

Changes in Scientific Work
One might list a number of political or contingent reasons to explain this development. They account for one or the other special feature of the situation in Germany; they are not, however, a sufficient explanation for the problems mentioned.
My impression is, first, that this state of affairs is by no means limited to Germany; even if DL movements are more vivid in the USA or the UK, the inroad into the everyday work of researchers, teachers and students, as well as of researchers and developers in industry, has not yet been found there either. Secondly, I doubt whether this reflects differences in principle between sciences and humanities; rather, the same deficiencies (albeit with gradual differences) seem to hold also in those scientific disciplines which were and are at the forefront of the development, implementation and usage of advanced systems of science information, e.g. physics, mathematics, medicine, biology; they are all the more prominent in the social sciences and humanities, which are traditionally more embedded in their national cultures, languages and habits.
As a consequence, I am convinced that an analysis has to look a bit deeper into the relation between changes and continuities in scientific work on the one hand, and the use of resources and instruments of digital information on the other. Unfortunately, no systematic research on this relation is available yet. There are studies on media usage in special environments (e.g. Berker 2001; Goll 2002). On the other hand, there is research to identify and describe communities of practice, but mostly without special attention to the use of digital information and related work practices (cf. the case studies in Huysmann et al. 2003). So, in the following paragraphs, I will present questions and theses rather than results. This might raise awareness that there are hidden problems and emphasize the necessity of dealing with them in the future.
A very simple economic model may help to specify the possible factors contributing to the differences between supply and usage of scientific information: There seems to be a more or less pronounced divergence between the supply of scientific information facilities based on information technologies and the demand of working scientists for IT-based scientific information. The theoretical options to explain this mismatch are limited: (1) Supply exceeds demand quantitatively, or (2) it does not meet the demand qualitatively, in its contents, or, as a special case, (2a) it is primarily technology-driven, not content-driven; (3) demand is sluggish because there are no measurable or sufficiently perceptible advantages in using the supply, or (4) because supply is too expensive (in terms of workload: it demands too much effort to be traded). These options describe analytic categories for approaching the problem described, but they have to be translated into real questions concerning the field of scientific information, knowledge and work.
In this section I want to deal with some of the characteristic features of the demand side, i.e. of scientific work itself and the scientists. A first group of questions and theses which I want to go through concerns the environment of work in the scientific sphere. (1) Have contents of sciences and humanities changed because of the introduction of informatized objects and methods into most scientific disciplines?
The answer is a cautious, but definite Yes. In the quantitative dimension, informatization makes it possible to model facts, relations and structures which so far could not be treated due to their sheer size. The terabytes of information which are delivered day by day in the big international geological and geospatial projects; the modelling and calculation of properties of substances in chemistry; the calculation of properties of free forms by systems of infinite equations in mechanics; the modelling and visualization of energetic processes in thermodynamics or in construction engineering physics; the recognition of patterns and the numerical comparison of gene sequences in biogenetics; but also the voluminous statistical calculation of cluster structures in the sociological analysis of social structures or in the economic investigation of input-output matrices, which allow for new insights and dimensions of analysis, are but some examples of the enormous potential of informatized procedures in science in general. Methods and technologies of simulation today play a central role in what Daniel Bell thirty years ago called "intellectual technologies" (Bell 1973). In the humanities, new methods for the analysis of texts, symbols, figures and pictures, i.e. in the more qualitative dimension, are imminent; however, computer philology is still in its beginnings. Informatization in scientific work goes along with new objects, new standards and norms: Virtual construction processes in mechanical engineering are based upon massive efforts of formal or de facto standardization of technical objects; and the normed definition of diseases by ICD 10 (the International Classification of Diseases) has enormous scientific and practical consequences in medicine, e.g. in the form of acceptance or rejection by health insurance institutions. So, my answer to the question posed above is: The examples listed show substantial changes in the contents of sciences and humanities, but we do not yet really have a systematic overview of their dimensions and extent.
These changes are mainly on the content side of scientific information. Are there correlates on the user side? More specifically: (2) Have working habits and conditions undergone a change due to the omnipresence of IC technologies? Have the communication and cooperation styles of scientific communities lived up to the expectations that the technological possibilities of IT seemed and still seem to promise? These are questions to which I know only few answers so far. We know that networks of peers are a common structure in various scientific spheres; we also know that network structures in the work of scientists are on the increase; we are also familiar with the traditional ways of networking among scientists via conferences, workshops, journals etc. But we have hardly any indications, apart from personal experience and impressions from colleagues, of how this working together is done, and especially of how it is conducted as far as ICT is concerned. So, to deal with the above questions, I can only express my guess that neither working habits and conditions nor communication and cooperation styles have really undergone changes as dramatic as those the environmental conditions certainly have. My hypothesis for the necessary studies in this sphere would be that by and large communication between cooperating scientists is essentially conducted by the exchange of papers and the use of telephone and mail; adequate collaboration systems seem to be absent, be it because of their own inadequacy, be it because of conventionality or ignorance on the side of the working scientists.
A third group of questions (3) complementing the user side of digital scientific information arises from these deliberations: What are the relevant communities in the respective fields? Do electronic communication and collaboration offer significant advantages to them? Is there a tradition of exchanging working papers, data etc. in printed or digital form? Is the individual scientist supported or discouraged by his or her environment in systematically using electronic facilities and in publishing and communicating in digital form? One often neglected dimension of publishing has to be recalled at this point of the argument: Publishing is not just the technical multiplication and dissemination of a text or other contents and its more or less successful introduction into the market; solving this task organizationally and technically is the easier part of the problem. The more difficult part is dealing with publication as part of the working mode of the scientific social system. Publication plays a crucial role in demonstrating and allocating acknowledgement, status, functions, jobs and remuneration in the world of institutionalized science. Journals, series, and scientific publishing companies in general are sources of honour and reward, of power and influence, and, last but not least, of income for learned societies. My impression is that electronic publishing so far has not provided a functional substitute for this system. The well-known guess that around 90% of scientific papers on the ArXiv server are later published in a printed journal suggests that the excellent solution for the quick and cheap dissemination of scientific innovation which this service provides does not seriously impede the working of the second crucial social process of publishing, its role as an allocation mechanism in the scientific system.
There is one additional consideration to be mentioned concerning the consequences of the availability and use of world-wide scientific information systems. These facilities might help to increase national and international competition in scientific fields, for they help to create world-wide markets for scientific information. Strongly canonized scientific disciplines, such as large parts of physics or mathematics, are familiar with working in the context of a global presence of their respective community. So it is probably not accidental that the first world-wide scientific information system (the Los Alamos server mentioned above) originated in these sciences. In contrast, in many fields of the social sciences and humanities the reference space is by tradition culturally or nationally defined. Here the advantages of the new systems might be more difficult to see, and they might be counteracted by real or alleged threats to one's own position in the scientific context, associated with the anticipated increased transparency of global information systems in science.
To sum up the argument of this section: In terms of the economic model sketched above, we seem to have a combination of options 2 and 3. The supply of electronic tool systems does not seem to meet the demand qualitatively; obviously, changed contents are important, but they do not seem to be processed within the newly available electronic communication and cooperation facilities. Put the other way around: Available systems do not seem to offer advantages substantial enough for them to be used instead of conventional ways of information, communication and cooperation.

Technological advances, organization and business models
To round off the picture we have to add a closer look at the supply side of changed scientific work, i.e. information supply in the various sciences. A first group of questions and theses (1) in this field aims at the technical characteristics of scientific information systems: Is the supply of electronic information in the respective fields organized in a centralized manner, usually as one or a few central databases, administered and kept by some central agency? (This usually implies more or less severe selections of contents.) Or do decentralized information structures exist in the field which are apt to react to the continuously changing information and communication modes in the sciences? This has consequences for the access possibilities of the individual scientist as a user and as a producer: For the user, centralized database structures usually go along with more or less specialized retrieval languages and routines, so that in the worst case I have to learn and keep in continuous usage a special language for every source. The alternative is the web-based (i.e. browser-based) access common to decentralized web-oriented information structures; here many attempts are made (and considerable progress has been achieved) to incorporate advanced retrieval options into user-friendly interfaces. In the role of producer (in science, most users are producers at the same time), the question is how I get my products into the publishing system. Do I have to deliver special formats, specialized metadata etc.? Do I receive support from the system to publish, to mark up the publication and to get it into review systems? The open character of the information and publication system depends on technological preconditions in the form of support for current standards (DC, XML/RDF, OWL, WSDL, OAI-PMH etc.). Are they adhered to, and how far are they implemented? Is the system's architecture adaptable to changing needs (e.g. to SODA-like structures)? The alternative of centralized vs. decentralized information systems is not only a question of competing technologies; rather, these are adapted according to social circumstances and interests. Centralized systems are usually run by centralized service institutions, often employing hundreds of scientific and administrative staff. So changes in organizational structures, especially the introduction of elements of bottom-up activity by scientists, whom the professionals consider to be lay people in terms of information technology and documentation, tend to entail bureaucratic counteraction by the latter. On the other hand, their attitude is often supported and justified by the complementary lack of interest of working scientists in questions of publication and documentation. It is especially difficult to turn this vicious circle into a virtuous one.
A second group of questions and considerations (2) in this section relates to the contents of scientific information systems and their availability; it has to deal with their organizational and economic conditions: To what extent are contents publicly available, and to what extent only via the market? In physics, for example, most contents seem to be easily and early accessible via ArXiv and complementary ways, whereas in chemistry the most important contents are published first and exclusively in journals of the leading publishers. How far do relevant contents exist in digitized form? In the more canonical sciences (mathematics, natural sciences) most contents are available in digital form, whereas in humanities and social sciences only unsystematically selected contents seem to be available electronically. A good measure for evaluating this situation is the question whether a scientist in his or her everyday working environment is able to do this work without repeated media breaks (these will prevent him or her from systematically using IT sources). Another question of this group aims at the quality of electronic information: Is the available information structured by metadata accepted in the community and eventually evaluated as to its reliability and relevance, or is it just any web content which I have googled according to ratings that are not transparent to me as a scientist? Finally, the question of conditions of access is important: Are electronic sources in the fields of research and teaching in the respective scientific disciplines accessible free of charge or for fees (option 4 above)? The much-discussed journals' or libraries' crisis has its roots here, and it is especially virulent in the fields dominated by large academic publishers.
To sum up the argument in this paragraph: We find some evidence that in many fields of scientific information supply does not fit demand in its quality (option 2 in the model above), quality in this context having a twofold meaning: Quality concerns on the one hand the quality of contents, discussed in the second group; on the other hand, it means quality of the supply mode as described in the first group of questions and considerations.Finally, we have many cases where supply is too expensive in time or in money terms.

Some strategic consequences
Seen in the context of the evolving new informational capitalism and the accompanying network society sketched at the beginning of this article, scientific work based on digital sources and a corresponding set of instruments is of vital importance for the future of science and of work, in the academic sphere as well as in the private economy and administration. Science is conducted in more or less competitive contexts. Since digital networking is a condition of productivity in both areas, it will increasingly be enforced on or adopted by working scientists. Growing parts of scientific work can be conducted, because of the character of their objects and methods, only in informatized form. So dealing with the development of scientific work and the usage of digital scientific information means discussing the future of science and of work in a changing society, their conditions and their chances.
We have found several instances of a mismatch between the supply of scientific information services and the needs and working habits of users and producers. It is worthwhile to improve the motivation and quality on both sides. As a general rule for the development of information systems one should proclaim the formula "picking the user up where he is". This is not just shrewd tactics to find support for a system but a responsibility of developers and providers of information services, one that decides on the success or failure of their work, i.e. on the quality of their product.
One more specific consequence of this general formula is to adjust information systems to their respective user communities. This presupposes knowledge of these communities, especially of their way of communicating and cooperating and of their use of technologies in doing so. On the one hand, serious research is needed to gain information on this unknown area on the map of science. But on the other hand, below that research level, every system designer should explore the community to secure the success and quality of his or her work. In their recent programmatic statement "Rethinking Scholarly Communication. Building the System that Scholars Deserve", Herbert Van de Sompel and his colleagues (Van de Sompel et al. 2004) formulate this principle. Since working within an electronic environment is a social setting and not just a question of improving technological efficiency, collaboration with the user (and also with the user as producer) is essential. User orientation has two complementary meanings: first, to let the user exert influence in shaping systems according to his or her work needs; second, to make working with scientific information systems a necessary and useful part of everyday scientific work, beginning with school and study practices.
To allow the user to influence a system, it has to contain bottom-up structures, because that is the only way to have a built-in reaction to changes in the user's work habits and needs and, furthermore, to make information work part of his or her everyday business. One should avoid erecting a wall of abstract contradictions between centralized and decentralized tasks and structures. On the one hand, the working scientist will best know his environment and what he needs to work optimally in it. So bottom-up structures are not just in the user's interest but in the interest of the efficiency of the whole system (cf. Meier et al. 2003). On the other hand, professional information and documentation does not belong to the normal education of any scientist; so he or she will need assistance in the information area, e.g. advice in getting to know all the world-wide information resources in the field and help in quality assurance of contents, let alone technical assistance.
Most curricula in higher education are not yet up to the new role of electronic scientific information, or to put it a bit more dramatically: The vast majority of curricula are adequate only to a past world of information and in this sense partly obsolete. Formulated positively: Teaching and learning the information dimensions of the respective scientific discipline should be an obligatory part of every scientific course of study. Getting to know resources and services, learning to handle the modern set of instruments, and becoming able to deal with the heterogeneity of information sources, especially the coexistence of printed and electronic material, but also of high-quality and googled contents, is an essential qualification for today's scientific work.
Last, but not least: New complementary forms of access to scientific contents have to be found. Neither the "free-for-all" approach nor the monopolization of whole scientific areas by a few academic publishers is viable in the long term. In practice, the approach of a free electronic pre-print and later publication in a printed journal, increasingly paralleled by e-versions of the publication, has evolved into a frequently used model. Besides, publishers are experimenting with new regulations for parallel print and electronic publication. On the other hand, on-line publications and new models for their organization are spreading. Today this field seems to be becoming one of experimental projects rather than one of controversies over principles (cf. Henry 2003).
Digital scientific information, its sources, its tools, its services, and especially its relationship to working scientists' work is an area of experiments and tentative developments. It is to a certain extent neglected by scientific research because it is still considered by most scientists an area of minor interest, a sphere of instruments, service and background technology. It is time for all sides to realize that it has become an integral part of original scientific work and has to be taken as seriously as the theoretical, methodological and applied dimensions of any scientific discipline. If scientists themselves do not understand these fundamental changes in their work, economy and society will force them, probably in rather unsubtle ways, to realize and to comply with these basic changes.