DATA BANKS AND ARCHIVES FOR SOCIAL SCIENCE RESEARCH ON LATIN AMERICA
Edited by William G. Tyler University of Florida
CONSORTIUM OF LATIN AMERICAN STUDIES PROGRAMS Publication No. 6 1975
CONSORTIUM OF LATIN AMERICAN STUDIES PROGRAMS (CLASP)
The Consortium is the national organization of institutions of higher education offering study related to Latin America. Formed in the fall of 1968, the Consortium provides the institutional dimension for the realization of the educational purposes of the Latin American Studies Association. Cooperative activities are arranged through the Steering Committee of the Consortium, while liaison is maintained through the Executive Secretariat of the Latin American Studies Association which serves both organizations. Annual dues for 1975 are $50.00. Since CLASP is in effect the institutional arm of LASA, CLASP members receive, in addition to CLASP publications, all publications of the Latin American Studies Association, including the Latin American Research Review, the LASA Newsletter, and occasional publications, without an additional charge above Consortium dues.
1974 CLASP Steering Committee: Charles A. Hale, chpn. (U. of Iowa); Robert J. Alexander (Rutgers U.); Carl W. Deal (U. of Illinois); John Finan (American U.); Marshall Nason (U. of New Mexico); Mary Ellen Stephenson (Mary Washington Coll.); Philip B. Taylor, Jr. (U. of Houston); Doris J. Turner (Kent State U.); Miriam Williford (Winthrop Coll.).
CLASP Publication No. 1: The Current Status of Latin American Studies Programs. ($1.00)
CLASP Publication No. 2: Employment Opportunities for the Latin American Studies Graduate. ($1.00) OUT OF PRINT
CLASP Publication No. 3: Financial Aid for Latin American Studies: A Guide to Funds for Individuals, Groups, and Institutions. ($1.00) OUT OF PRINT
CLASP Publication No. 4: Opportunities for Study in Latin America: A Guide to Group Programs. ($1.00)
(The charge for CLASP Publications 1-4 is $0.25 less each for CLASP and LASA members.)
CLASP Publication No. 5: Latin America: Sights and Sound: A Guide to Motion Pictures and Music for College Courses. ($2.: ($1.50 to CLASP and LASA members)
CLASP Publication No. 6: Data Banks and Archives for Social Science Research on Latin America. ($7.00) ($3.50 to CLASP and LASA members)
Further information about CLASP and the Latin American Studies Association will be gladly provided by the LASA Secretariat, Box 13362, University Station, Gainesville, Florida 32604. Telephone: (904) 392-0377.
DATA BANKS AND ARCHIVES FOR SOCIAL SCIENCE RESEARCH ON LATIN AMERICA
William G. Tyler University of Florida 1975
CONSORTIUM OF LATIN AMERICAN STUDIES PROGRAMS CLASP Publication No. 6
TABLE OF CONTENTS
Chapter 1: Developing the Infrastructure for Social Science Computing in Latin America: Strategy and Tactics
David Nasatir
Chapter 2: The International Data Library and Research Service at the University of California
IDL&RS Staff
Chapter 3: Social Science Data Access Problems: A Brief Description of the Wisconsin Experience
Alice Robbin
Appendix: Latin American Holdings of the University of Wisconsin Data and Program Library Service
Chapter 4: The ICPR as a Resource for Social Research: Data Developments Relevant to Latin American Studies
Richard I. Hofferbert
Appendix: Data Sets with Latin American Content in the ICPR
Chapter 5: Problems of Data Acquisition in Latin America: The Roper Public Opinion Research Center
Philip K. Hastings and H. Jon Rosenbaum
Appendix A: Inventory of Latin American Surveys at the Roper Center (As of May, 1973)
Appendix B: Sample Question Index Cards
Appendix C: Selected Bibliography of Publications Based on Roper Center Data
Chapter 6: Toward a Wider Diffusion of Data Available in Academically-Oriented Data Archives: The Latin American Data Bank at the University of Florida
Manuel J. Carvajal
Chapter 7: Banco de Datos de CELADE
CELADE Staff
Chapter 8: IBRD Data Systems
IBRD Staff
Chapter 9: The International Statistical Programs Center of the U.S. Bureau of the Census
Carlingford Gray, Jr.
A number of recent developments have contributed to the growing importance of data banks and archives for social science research. The availability of the computer, machine-readable information produced by governmental organizations, and growing survey research materials, along with an increasing demand for and emphasis on quantitative research in fields of inquiry related to social phenomena, have all served to focus attention on the machine-readable data archive as an important tool for the researcher. As a response to the increasing demand for statistical information, a great number of data banks and archives have been established, and recent years have seen their holdings grow exponentially. Of these data facilities there are a number which possess information dealing with Latin America and which are of interest to social scientists doing research in or on Latin America. This volume is an attempt to provide the researcher with information on some such data archives. [1]
[1] A complementary and valuable survey, with a useful orientation to the actual uses of machine readable data in various disciplines, is contained in Robert S. Byars and Joseph L. Love, eds., Quantitative Social Science Research on Latin America (Urbana: University of Illinois Press, 1973).
The volume, consisting of contributed articles by representatives of the major data banks with extensive Latin American holdings, has grown out of a Consortium of Latin American Studies Programs (CLASP) sponsored session at the May, 1973, Latin American Studies Association (LASA) Fourth National Meeting held in Madison, Wisconsin. The purpose of the session was threefold: (1) to provide LASA members with information as to the activities and data holdings of the represented data archives; (2) to increase utilization of the data facilities, all of which have been underutilized; and (3) to offer a forum for the informal exchange of ideas and a basis for the further contact and collaboration of the representatives of the various data archives. Since the May, 1973, meeting some additional papers have been included, and materials have been updated. The final product, presented in this volume, does not pretend to have included all the data facilities with important Latin American holdings. In fact, apologies are expressed for sins of omission; it was not possible to make the listing more extensive. Our emphasis has been mainly on North American institutions, particularly university-affiliated data archives. Even so, it is our hope that this modest collection of papers can usefully serve as a point of departure for social scientists interested in machine-readable data dealing with Latin America.
Most of the papers dealing with specific data archives present some information as to holdings. While exhaustive data listings have not been included, in all cases the relevant addresses have been provided. Those readers interested in the operations or holdings of a particular data bank are encouraged to enter directly into contact with the institution(s) in question.
In examining data banks and archives pertinent to social science research on Latin America, two separate approaches are possible. First, discussion can proceed as to the many problems of the data facilities per se. The organization of data banks, technical problems of data retrieval and accessing, software packages for data analysis, data pooling arrangements, protection of confidentiality, data acquisition difficulties and priorities, and financial problems all could serve as the focus of discussion of data banking. To be sure, all of these issues are of importance. Moreover, as data banks gradually exercise greater importance in scholarly research, more attention will have to be devoted to the problems of their operation. In our opinion, data banks, either existing or yet unestablished, will play an important role in future scholarly research dealing with Latin America. Even at the present time, a large data base exists and is available, and modern techniques of analysis can go a long way in making more effective research use of the data base. In the future it is conceivable that machine-readable data libraries will play as important a role in social science research as the traditional book libraries. The computer opens up new vistas, and the Latin Americanist should make effective use of it in carrying out his own research.
A second approach to the subject of data banking is to concentrate on user information. Rather than discuss data banking problems per se, attention should be devoted to problems that potential users may encounter. Information can be provided to the researcher as to the existing data banks and how they may possibly serve his research interests and needs. This volume mainly concentrates on these questions.
DEVELOPING THE INFRASTRUCTURE FOR SOCIAL SCIENCE COMPUTING IN LATIN AMERICA: STRATEGY AND TACTICS
David Nasatir, University of California, Berkeley
FOR SEVERAL YEARS, NOW, IT HAS BEEN POSSIBLE TO OBTAIN THE DATA from at least forty surveys conducted in Latin America over the last 15 years by some of the world's outstanding scholars and research organizations. The data is already coded, punched on cards and resident on magnetic tape. It is suitable for analysis by almost any computer. The methods utilized and the coding schemes employed are fully documented. Many of the studies are part of multi-national comparative designs. They are on a variety of interesting topics: fertility, mobility, politics, aspirations, values, attitudes and behaviors. The entire package, ready for analysis, can be obtained for about $250, or less if only part of the collection is desired. Yet very few individuals or institutions have taken advantage of this opportunity and almost none of them are from Latin America.
There are many reasons why the resources of social science data archives are not very well utilized but these reasons can be rather easily classified into two large categories: social and technical barriers.
There are several social barriers that must be overcome before a scholar can really make use of social science data archives. In the first place, he must know of their existence, and their contents. Although there are more than two dozen institutions in the U.S. devoted to the collection and rediffusion of machine readable social science data (at least two of which have substantial holdings from Latin America) and several more in both Europe and Latin America, there is no single organization to represent this group. There is no place to turn for a complete and up-to-date list of the archives, their locations, the nature of their holdings, their policies of data acquisition and diffusion and the names of their key personnel. The best device for overcoming this barrier, at the moment, is a careful review of the journals Social Science Information, published by UNESCO; and Social Science Data, published at the University of Iowa. The Latin Americanist would not find out too much from this review, however, as there is no general list available of centers with a particular interest in that region. Even if a perceptive scholar should make contact with the archives at Berkeley or Florida (to cite two facilities with substantial holdings from Latin America), the indexing of the collections leaves something to be desired. Letters, phone calls or even a visit are required to ascertain if the data on hand will actually suit the purposes of the research. At Berkeley, at least, an initial charge of ten dollars is made just to provide a catalogue of the archive's holdings and an investigation of the documentation to see if there may be material of interest to the potential user.
Even if a scholar knows what he wants and where it is, political, fiscal and physical difficulties may make data acquisition difficult. Not all data sets are available to everyone. Some archives require the potential user to obtain permission from the original source of the data. Even where this has been obtained, there may be a substantial cost associated with getting a copy of the materials desired. This is usually true if the data set has never been previously requested and requires special treatment ("cleaning") before it can be rediffused. Normally, the total cost of making a copy of the most complicated data set should be less than fifty dollars. Getting the dollars, if one is in Latin America, may prove difficult, and most archives have difficulty handling problems of foreign exchange. If the fiscal problem is solved, the physical may still remain. Data on cards for a substantial study may weigh over fifty pounds. Postage costs may then exceed the costs of acquisition. If the data is shipped on magnetic tape, however, great care must be exercised to assure that it is written in a form compatible with the facilities of the user, that it is not erased in transit, and that it will not languish overly long in a customs shed.
Whether or not it is easy to obtain data, local customs may make it difficult to exploit it. For many, the fascination with data collection substitutes for the excitement of analysis, and if the data are not "new" they are, somehow, not good. Unless a local "style" compatible with secondary analysis can be established, the availability of data will be no guarantee of its use.
TECHNICAL BARRIERS: HARDWARE
The absence of a tradition of secondary analysis is frequently linked with the total absence of quantitative analysis. This is often due to a lack of both the technical and organizational facilities required to facilitate the analysis of sample surveys containing hundreds of responses to dozens of items. The rapid evolution of data processing technology has led to a situation where few scholars have access to a complete unit record shop (keypunch, reproducer, interpreter, counter-sorter and statistical or accounting machine) of the kind required for the style of analysis described in the older texts on the subject, nor a large, interactive computer center to carry out the analytic procedures described in the most current texts. To complicate matters, the evolution of the technical infrastructure is rarely controlled by the research scholar and he often discovers that the acquisition of newer or "better" equipment has actually made it more difficult, more time-consuming or more expensive for him to get his work done. This paradox arises for the most part from an early commitment to unit record equipment leading to the use of multiply-punched cards that cannot be handled easily within most computing environments. Occasionally difficulties are encountered due to incompatibilities among different types of machines: ninety versus eighty column cards, for example; round holes instead of rectangular ones; or tapes written at one installation that are unreadable at another due to different density or track capabilities. There have even been instances where materials prepared for one series or generation of computers could not be used without extensive modifications for more modern replacements made by the same company.
Above all, there is the problem of cost. Data processing by computer can be very time consuming. The usual device for setting priorities for the use of the machine is by charging for "time" but precisely how that "time" is to be calculated is a matter of local policy. In most university environments, computing policy is often established by those interested in heavy computation with a small amount of data to be read in or printed out. The cost for social science data processing in such circumstances may prove a truly insurmountable barrier.
TECHNICAL BARRIERS: SOFTWARE
Even an abacus is of little utility if you don't know the procedures for adding and subtracting with it. Getting the equipment to process data in the manner desired is often more difficult than getting the data or obtaining access to the machines for processing it. Unfortunately, computer programming is not a skill that is widespread among data analysts and the attendant dependence on others can be very frustrating. Fortunately, there are relatively few different types of procedures employed by the majority of analysts and several programs and program packages have been developed to carry out these procedures. Not every program will run at every installation, of course, nor does any package contain all the programs desired by every analyst. But it is not necessary for the analyst to be a programmer, only to know enough about what he wants to do to be able to choose the proper programs and packages.
Being able to make such a choice, however, usually requires some knowledge of what will work on the local facilities. It must be possible to consider the strengths and weaknesses of different program packages from a technical as well as an analytic perspective. This kind of information emerges best from situations in which there is a sustained analyst-technician interaction and that only occurs in a few organizational settings.
All too frequently, data processing machinery is located within an organization unfamiliar with the demands of social science research. Either the organization is oriented toward arduous tasks of computation or, if it does any data processing, it is of the administrative accounting and payroll type. A request for the combination of several variables to create an index, collapsing of categories and cross tabulation of two or more variables cannot be handled routinely in such a setting. Each of the ordinary steps of the analyst of social science data is viewed as a major, troublesome, costly endeavor. Such requests are rarely met with good grace or efficiency.
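The three routine requests named above (combining variables into an index, collapsing categories, and cross tabulating two variables) can be illustrated with a minimal sketch. The records, variable names, cut points and the additive scoring rule below are all hypothetical, chosen only for illustration; they are not taken from any archive's codebook or from any particular program package.

```python
from collections import Counter

# Hypothetical coded survey records; variable names and codes are
# illustrative only and do not come from any archive's codebook.
respondents = [
    {"age": 23, "votes": 1, "party_member": 0, "attends_rallies": 1},
    {"age": 37, "votes": 1, "party_member": 1, "attends_rallies": 1},
    {"age": 52, "votes": 0, "party_member": 0, "attends_rallies": 0},
    {"age": 68, "votes": 1, "party_member": 0, "attends_rallies": 0},
    {"age": 29, "votes": 0, "party_member": 0, "attends_rallies": 0},
]


def participation_index(r):
    """Combine three 0/1 items into a simple additive index (0 to 3)."""
    return r["votes"] + r["party_member"] + r["attends_rallies"]


def collapse_age(age):
    """Collapse exact age into three broad categories (assumed cut points)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


def collapse_index(score):
    """Collapse the index into low (0-1) versus high (2-3)."""
    return "high" if score >= 2 else "low"


def crosstab(records):
    """Count respondents in each (age group, participation level) cell."""
    return Counter(
        (collapse_age(r["age"]), collapse_index(participation_index(r)))
        for r in records
    )


table = crosstab(respondents)
for (age_group, level), n in sorted(table.items()):
    print(f"{age_group:12} {level:5} {n}")
```

Each step is a few lines once the data are in machine readable form, which is the sense in which these are the "ordinary steps" of the analyst rather than major programming tasks.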
STRATEGY AND TACTICS FOR OVERCOMING THE BARRIERS: SOCIAL
There are three steps that may be taken immediately to facilitate greater use of existing facilities. First of all, it is essential that someone be commissioned to produce an annotated guide to the location of studies on Latin America already available in machine readable form. Several attempts have been made to create a universal catalogue but they have not been successful due, in part, to the magnitude of the task, lack of focus, and inadequate financing. A clearly delineated constituency exists for the Latin American studies and the creation of this annotated guide, or inventory, is not an end in itself but a preliminary to the second step.
With the inventory well underway, efforts to acquire copies of the materials can be undertaken. This should be done in order to develop a core collection to be distributed widely: at least one complete copy for every country in Latin America. Total cost of such a venture might come to $25,000, including the production of the inventory, the "instant data bank" and its distribution.
With the core collection or instant data bank on hand locally, it is possible to undertake the third and most essential step: training potential users. This can best be accomplished by a program of activities conducted nationally or regionally where local scholars work, with technical assistance, to develop the organizational arrangements that would permit use of the instant data bank. All of this might take place under the rubric of a training seminar devoted to secondary analysis. The curriculum of these seminars could focus, at first, on comparative analyses of data from the host country with data from other countries. Such analyses are of great local interest and immediately focus on the skills of descriptive studies. Analytical efforts seeking to elaborate models, generate hypotheses and test for the robustness of relationships will soon follow.
National or regional seminars bring together a "critical mass" of interested scholars. They provide an opportunity for sustained interaction and legitimation for the work of the secondary analyst. And they provide the nucleus for an organizational infrastructure to facilitate quantitative research in Latin America. More about this a little farther on.
Problems associated with both machinery and programs were mentioned earlier as barriers to be overcome for full utilization of data archives. It is often the case that there is more machinery locally available than many analysts realize. In addition to academic (and governmental) facilities, commercial facilities are often available. Computers are a large capital investment. Most organizations that have them are eager that they be utilized as much as possible. This means twenty-four hours a day, seven days a week in many instances. Purchase of services from these sources is often possible.
There is a need, in any setting, to have a good knowledge of the computing options available. This is particularly true in Latin America. To obtain this knowledge, the workshop sessions described earlier should be preceded by a carefully conducted survey of local (or regional) facilities. In addition to facilitating the work of local analysts, the information from such a survey provides a basis for developing a rational policy for the acquisition of new computing machinery. A long-term strategy should be developed to promote compatibility among local installations and between local and "foreign" installations. The social scientists should assure their part in the development of this strategy by playing an active role in determinations of the current situation in their region and by articulating the directions in which they expect to move. These activities promote the legitimacy of later requests to share the costs of constructing a computing infrastructure that will serve the needs of the social science community.
Like most things, computers will not function as the analyst desires unless the analyst can express his needs in terms comprehensible to the machine. The difficulty of this task is appreciated more fully when we recall the problems associated with getting another person to comprehend our desires. In point of fact, communicating with the machine is often easier than communicating with someone else; especially when dealing with sophisticated concepts from the social sciences.
It is far easier to train an analyst to deal with machines than to train those familiar with computing to be social scientists. This is particularly true now that a variety of program packages are readily available. It is no longer essential that the analyst be able to write his own programs. In most cases it is merely a matter of learning how to use already existing programs.
A central part of the workshop program, then, would consist of making program packages operational in the local environment and teaching scholars how to exploit these packages to meet their analytical needs. With some experience in this area, it is relatively easy to become a sophisticated consumer; one capable of comparing and evaluating new packages as they appear and translating their requisites into the local scene.
It is difficult, if not impossible, for an individual to work entirely alone in the analysis of social science data by machine. It is, by its nature, a collaborative venture. An organization for this purpose is an absolute necessity. Not only does an organization provide a physical locus for the research activities, it provides the setting for the development and maintenance of a social science computing culture. It provides a location for recruiting and involving computing and analysis people in a common endeavor. It provides the setting for mutual stimulation intellectually, and creates a hierarchy of roles. This hierarchy creates new career patterns, both technical and analytical. Without the possibility of mobility within the organization, skilled and talented personnel will not stay and each analyst will have to recreate the necessary arrangements for himself.
An organization of data analysts and technical support personnel is essential to provide guidance for new acquisitions and the growth of the initial core collection or instant data bank. Resources are always scarce. There is always more available than will be used. Setting the priorities for new acquisitions must stem directly from analytical needs and these cannot be articulated without an organization of users.
Once the local organization is underway, it is possible for it to exert a positive influence on research conducted in the local region. Not only can it provide a repository for new data, it can, by proper exercise of influence, require that data submitted for archiving meet certain minimal standards of methodology.
Documentation, data cleaning, and allied tasks are costly activities. Both the original source and later users of the data benefit when the original source performs these services prior to archival acquisition.
Policies that promote rediffusion of data should be encouraged. All data accepted by an archive should, after some arbitrary period, become available to any user for any purpose. The alternatives are not only undesirable, they are, in practice, unenforceable.
Finally, we must consider the costs of such an enterprise. Although users of a local archive should bear the marginal costs incurred to serve their peculiar needs (e.g. making a copy of the data set), the basic operations of the organization's activities must be independent of the utilization of the data. Every attempt to finance operations by charging for the use of data has resulted in restricting rather than promoting the free flow of information.
Unfortunately it has been difficult to obtain the type of funding required. Only where groups of scholars and institutions have banded together to form collaborative ventures has there been any success. If we hope to promote the development of secondary analysis of machine readable data by scholars interested and located in Latin America, it is time for us to do the same.
THE INTERNATIONAL DATA LIBRARY AND REFERENCE SERVICE OF THE UNIVERSITY OF CALIFORNIA*
SINCE 1961 THE SURVEY RESEARCH CENTER HAS BEEN DEVELOPING AND maintaining an International Data Library and Reference Service (IDL&RS). This facility is designed to assist social scientists in obtaining, processing and analyzing existing survey materials. Since its inception this organization has been acquiring and cataloging basic data from important American and foreign surveys of national, regional, local and special populations. Through a National Science Foundation grant special efforts were made to obtain survey materials from developing nations of Asia and Latin America. An extensive collection of original questionnaires, codebooks, and data on cards and tape has been developed to make these materials available for further scholarly analysis.
The purpose of the Data Library is to open these studies to wider use in research and in student training. Data which would cost many thousands of dollars to collect anew is here available for the relatively small cost of duplication. Costs for acquiring materials for a given study are dependent upon the size of the sample, documentation and condition of the data. Users who require more specific manipulation of data than simple duplication will find that costs rise considerably. The breakdown of cost categories is: codebooks, 10¢ per page; 2400' computer tapes, $12 each; labor, $10 per hour (generally one hour of time is sufficient for simple requests); computer time (approximately $10 per 2,000 punched cards and $5 per 2,000 card images on tape); postage and handling (generally less than $5 per study unless the user specifies airmail). Special multiple-file tapes have been compiled with studies from specific areas of the world, such as the Latin American Data Bank. They are available at standard prices which enable the user to acquire large numbers of studies at very low costs per study.
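As a rough illustration of how these cost categories combine for a single request, the following sketch adds them up. The rates are those quoted above; the linear proration of computer time per card image, the use of the $5 upper figure for postage, and the example study itself are assumptions made only for this illustration.

```python
from decimal import Decimal

# Rates quoted in the text (1975 dollars).
CODEBOOK_PER_PAGE = Decimal("0.10")
TAPE_EACH = Decimal("12.00")
LABOR_PER_HOUR = Decimal("10.00")
COMPUTER_PER_2000 = {"cards": Decimal("10.00"), "tape": Decimal("5.00")}


def estimate_cost(codebook_pages, card_images, medium="tape",
                  labor_hours=1, tapes=1, postage=Decimal("5.00")):
    """Rough duplication cost for one study, using the rates quoted above.

    Assumptions not in the text: computer time is prorated linearly per
    card image, and postage is taken at the quoted upper figure of $5.
    """
    if medium not in COMPUTER_PER_2000:
        raise ValueError("medium must be 'cards' or 'tape'")
    total = codebook_pages * CODEBOOK_PER_PAGE
    total += labor_hours * LABOR_PER_HOUR
    total += Decimal(card_images) / 2000 * COMPUTER_PER_2000[medium]
    if medium == "tape":
        total += tapes * TAPE_EACH  # the tape itself is charged per reel
    total += postage
    return total.quantize(Decimal("0.01"))


# Hypothetical study: 60-page codebook, 4,000 card images, one tape.
print(estimate_cost(60, 4000))  # prints 43.00
```

Under these assumptions a simple request of this size comes to $43.00, which is in line with a per-study figure of under fifty dollars.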
From its founding in 1961 as a consequence of the convergence of interests of a group of scholars at the University of California, the International Data Library and Reference Service (IDL&RS) has been dedicated to facilitating the social sciences in developing countries, especially in Latin America. These activities were reinforced by funds from the National Science Foundation during the period 1964-1970 for the establishment at Berkeley of an archive of survey data collected in Asia, Africa, and Latin America. The IDL&RS was able to obtain the collaboration of many scholars in obtaining high quality survey materials from these areas and has been instrumental in making these materials available to individuals and archives throughout the world in the form of IBM cards, magnetic tapes and documentation.
*Prepared by the staff of the International Data Library and Reference Service.
The major goal of the IDL&RS has been to provide the highest quality materials for instruction and research at the lowest possible cost. Thus the IDL&RS has been careful to accept materials for archiving only if they met certain minimal technical and methodological standards. The instruments, sampling and data gathering techniques as well as the theoretical relevance of potential donations were carefully reviewed by the Archive staff.
Studies deposited in the archive were "cleaned" (i.e. all discrepancies between data and documentation were resolved or noted) and placed on magnetic tape. A machine readable version of the documentation was also prepared that included the questions, instructions to interviewers, answer categories and the frequency distribution of responses. The inspection of these publications permits the interested investigator to solicit a deck of cards or magnetic tape containing only those data relevant to his project.
Unfortunately, no funds were available for continuing the work of the IDL&RS after 1970 and so no further development of the Latin American collection has taken place. It is possible, however, to obtain any of the studies currently available for the costs of duplication and mailing (usually less than $50 per study). Thirty-three of the more popular studies have been prepared for shipment as a package to those interested in establishing an "instant" data bank on Latin America. Titles of these studies and the countries in which they were carried out are given below. A more extensive listing and description of studies in the IDL&RS Latin American collection can be obtained by contacting the Data Librarian.
It is no longer feasible for the Data Library to respond freely to the many hundreds of requests received annually. Individuals requesting information will now have to bear the cost of providing it. Please accompany all requests for information about the Data Library and its holdings with a check for ten dollars ($10) payable to "The Regents of the University of California." In return for this nominal fee, you will receive (1) an up-to-date catalog of studies in the collection (currently 98 pages covering over 1 studies), (2) copies of relevant study descriptions when available, and (3) a letter detailing other information brought to light by a short but thorough personalized search. Inquiries should be directed to John Lawson, Data Librarian, International Data Library and Reference Service, Survey Research Center, University of California, Berkeley, California 94720.
Selected IDL&RS Holdings on Latin America
Attitudes of Cubans
Career Values in Mexico
University Students-Values, Vocational + Political Orientations Panama:
University Students-Values, Vocational + Political Orientations Puerto Rico:
University Students-Values, Vocational + Political Orientations Family and Population Control
Role of University in Development of Political Consensus Stratification + Mobility in Four Latin American Cities Theory of the First Impact of Economic Development Voting Opinions Voting Attitudes Political Apathy in Rosario
Law School Students Career Values in Brazil University Students Images of the U.S.
University Students-Values, Vocational + Political Orientations An Aggregate Data Bank + Indices of Brazil
World Survey II Attitudes Toward Domestic and Foreign Affairs Voting Attitudes in Rio
Stratification + Mobility in Four Latin American Cities
Personnel Managers Union Leaders
Students at State U of Santiago, U of Concepcion + U of Temuco Agrarian Reform in Chile
Stratification + Mobility in Four Latin American Cities Political Behavior
Attitudes and Opinions Towards Education and Work Colombia:
University Students-Values, Vocational + Political Orientations Personality Disorganization Among Refugees of the Violencia
University Students-Values, Vocational + Political Orientations Uruguay:
University Students-Values, Vocational + Political Orientations Venezuela:
Attitudes of Students at La Salle, A Catholic School in Cai Prestige of Latin American Nations According to Students
SOCIAL SCIENCE DATA ACCESS PROBLEMS: A BRIEF DESCRIPTION OF THE WISCONSIN EXPERIENCE
Alice Robbin, Data and Program Library Service, University of Wisconsin, Madison
Over the last few decades the variety of information in machine-readable form has grown exponentially. Government agencies, research institutes, marketing firms and private individuals throughout the world have designed and continue to design studies to secure completely new data in order to test hypotheses. The large number of projects, new and old, has resulted in heavy costs in personnel and money, an extraordinary investment of time in gathering and organizing data, a general shortage of information about current research, the problem of adequate retrieval of the data gathered by these projects, and the necessity of writing new computer programs for data analysis. It is therefore not too surprising that during the last decade we have witnessed a change in research strategy. A growing number of scholars in the social sciences have recognized the importance of preserving data and computer programs which could have significant subsequent value for other researchers. The development of facilities for acquiring, storing and maintaining the increasing quantity of available machine-readable social science data, and computer analysis and data management programs, was the logical outcome of this strategy.
At the University of Wisconsin, such a facility, called the Social Science Data and Program Library Service (DPLS), was established in 1966, with the support of social scientists, especially in the departments of economics, political science and sociology, and The Graduate School. As a local facility, DPLS has as its major functions:
(1) the acquisition, storage and maintenance of data files from all the social sciences, as those files become available for secondary analysis, either from individual researchers, from other local archives or through national social science data repositories. The latter include especially the International Survey Library Association (ISLA) archive at the Roper Public Opinion Research Center in Massachusetts and the Inter-University Consortium for Political Research (ICPR) at the University of Michigan;
(2) the providing of cleaned and well documented data files for members of the university community, at as low a cost as possible, for use in faculty and graduate student research and for training in research techniques as part of regular classwork;
(3) the providing of individual and classroom instruction on where to look for appropriate research data, how to acquire them, how to use them efficiently, and how to interface the data with existing software and hardware;
(4) the acquisition, storage and maintenance of computer programs in "debugged" and well documented form for use with data in the Data Library and for other research purposes;
(5) the providing of program consultants for assistance in the selection and use of statistical programs;
(6) the providing of unit record and related equipment for processing the data in conjunction with the University of Wisconsin, Madison Academic Computing Center; and,
(7) the establishment and maintenance of relationships with researchers and other data suppliers outside the University.
Since 1966, when we began a Data Library with seven data files donated by the Department of Economics, and a Program Library with only a few programs, DPLS has acquired about 500 studies containing national and cross-national historical, economic, political, administrative, census and electoral records, opinion and attitudinal surveys, and biographical information. Growth of the Program Library has been even more marked: close to 1000 statistical programs and routines. To further the cooperative use of social science oriented software, a decision was made in the late 1960's to become the central facility for information about and access to computer programs located at other installations. With the support of the National Science Foundation, the National Program Library and Central Program Inventory Service (NPL/CPIS) was established in 1969. From 1969 to 1972, NPL/CPIS undertook to inventory and abstract existing statistical programs throughout the United States and elsewhere, to employ an interactive retrieval system to locate programs, and to supply users outside the University of Wisconsin system with programs and program documentation.
During these years, we have been faced with a number of complex problems. Some of these have included access to data and programs, allocation priorities, acquisition priorities, recruitment of personnel, cooperative arrangements with organizations within the university structure and with other archives, dissemination of information about the archive's resources, under-utilization of the archive's resources, assistance to data gathering agencies, and financing of the archive. Because it is not possible to discuss in detail all or even some of these problems, this paper will concern itself with a brief description of five problems of data access: (1) locating data, (2) retrieval of data, (3) quality of the data collection, (4) restrictions on dissemination, and (5) exportability of the data storage medium.
I. LOCATING DATA
Locating data is a major problem for the archive staff, which is continually confronted with a cumbersome and inadequate system. Not only are most of the secondary systems not inventoried, thus making it nearly impossible for anyone, save the cognoscenti within a particular discipline, to have knowledge of the data's existence; but, although many archives exist, there is no centralized facility which provides the local service archive with information about data holdings of other repositories of data. The one journal devoted to reporting data holdings of archives in this country and abroad, S S DATA: A Newsletter of Social Science Archival Acquisitions, must depend on the local archive to send a description of its data files which may be publicly disseminated. Local archives, in most cases, do not employ a standardized set of content descriptors which are understood by all using machine-readable data. The problem of a lack of standardized descriptors is compounded because social scientists in different disciplines employ different terms to describe the same phenomenon. Further, there is no agreement about how to provide (i.e., in what form) the data archivist with information about the contents of data collections (although recommendations have been formulated and are now circulating around the data repository community).1 In other words, what data archivists do not have is a system of cataloguing similar to the Library of Congress system (which, although some consider it to be outdated and somewhat unresponsive to changes in our technological society, still provides a standardized method of describing and categorizing information). Even if the LC system were adopted by data archivists, it would not adequately describe a data collection because a user most often needs a specific set of variables measured for a particular population during a specified time period.
(While the time period and population could be more easily and completely described, it would be quite difficult to describe satisfactorily and in the detail needed by a researcher, the set of variables contained in the study.)
Because there is no centralized facility which provides information about data, archive staffs presently devote an extraordinary percentage of time to "tracking down" sources of data for a potential user. Because few archives publish summaries of their holdings (which facilitate, at least at a superficial level, the locating of data), one is forced to spend precious hours either on the telephone or in correspondence with an archive. Data archive staffs are, therefore, dependent on several things in order to locate data: extensive, broad reading of publications of the social science disciplines, collections of bibliographies citing publications based on analyses of machine-readable data, subscriptions to the relatively limited number of journals which discuss archive holdings and current and completed projects which involve data collection, knowledge of faculty members and graduate students specializing in particular disciplines who can be relied upon to suggest data sources, and an informal communications network among archivists.
Researchers are all too aware of the proliferation of journals and journal articles within their respective disciplines and even within their particular area of expertise. We are familiar with the lament of the researcher: the impossible task of keeping up with the continuing explosion of resources at his disposal. Yet, the staff of a general purpose archive responsible for collecting machine-readable data generated by all the social science disciplines is increasingly called upon to assume full responsibility for knowledge of the existence of social science data collections. It is generally expected that the data archive staff will have information about data held by other archives, data generated by individual researchers, the federal governments of this country and abroad, private institutes and professional polling agencies. Given that a broad in-depth knowledge of social science is impossible, data archivists tend to rely on the limited number of collections of published and mimeographed bibliographies, journals which cite publications based on analyses of machine-readable data and which make provisions for information exchange, such as the International Social Science Journal, Social Science Information, and The Historical Methods Newsletter, and journals which discuss current and completed projects which involve data collections, such as S S DATA: A Newsletter of Social Science Archival Acquisitions, Survey Notes (a publication of the Survey Research Laboratory at the University of Illinois), and the Newsletters of the Institute of Social Research of the University of Michigan and Temple University.
That the data archivist depends upon face-to-face communication with faculty members and graduate students should be of no surprise to anyone. These are the individuals most knowledgeable in their discipline. The major difficulty, however, is that it requires several years at an institution in order to know whom to call for advice. Our experience at DPLS serves as a good example: Some time ago we received a telephone call from a faculty member in the School of Business. He was preparing to conduct a survey among middle-management people in industry and was searching for questions on role perception. He wondered whether we held any surveys (or knew of any) related to this area. A search of the data holdings and documentation revealed nothing. One of the staff at DPLS had done graduate work in political science and had taken a course under a professor whose major interest was role theory. A call to him was made and he suggested that the individual most knowledgeable in the area of role perception and survey research was a professor in the Department of Psychology. We were then able to refer the faculty member at the School of Business to the professor in the Department of Psychology. While we could and indeed did provide a valuable reference service for this individual, this description illustrates the difficulty of locating data which no doubt existed but of which the data staff had no knowledge.
A promising mode of organizational communication among archives was the Council for Social Science Data Archives (CSSDA). The CSSDA was in existence only a few years and the formal network of communication that it created existed mainly among the directors of major archives. While most data archives came into existence after the demise of CSSDA, the Council, nevertheless, provided enough initiative and enthusiasm among the individuals actually responsible for implementing some of its recommendations to maintain an informal communications network. This informal system has meant, most importantly, that archivists at one repository have maintained an interest in the subject needs of other repositories and are still somewhat able to communicate new data sources when they become available at some institutions. Unfortunately, this informal network of communication among archivists exists only among the larger archives which service not only their local clientele, but individuals outside their institution, and only among those archivists who were active during the years of the Council for Social Science Data Archives. The smaller archives (generally, university campus archives) depend upon what is available to them through their membership in the Inter-University Consortium for Political Research.
By now, it must appear to the reader that the data archivist is "back in the Stone Age" at a time when he/she is surrounded by third generation computers which have helped us reach the moon. And if we have reached the moon, why have we not employed the same technology to locate data here on earth? The situation of data archives does appear to approach the ridiculous. But, considering that data repositories have computers at their disposal and that their staffs make use of these computers every day, that no efficient system of locating data has actually been implemented must demonstrate the difficulties of employing the computer to reduce search time.2
II. RETRIEVAL OF DATA
It would be well at this point to turn our attention to a second problem of access, that of the retrieval of data. There is no doubt that the greater the number of studies stored, the more urgent the problem of adequate retrieval systems, both for the archive staff and for their clients. The archive may have quality data, but if the archive is to provide a useful service, it must do it efficiently. However, what we have learned over the last decade is that information retrieval via computer is not so easily operationalized.
We have learned that users have different needs and there must be great flexibility in the kinds of output optionally available: Some users require complete texts of codebooks for studies. Others need brief descriptions of all studies in a repository which are relevant to their analysis. Others may want a list of analytic concepts which are operationalized in a given set of studies. Still others want to tie discrete collections together and retrieve them physically together. Others want to retrieve individual records. Still others need only specified variables. A user at one computer installation wants to query the holdings of a repository which stores its data in another type of computer. A data repository wishes to export the retrieval system to another installation.
There are two aspects to the retrieval problem, the technological and the methodological. The technological involves the computer environment, storage medium, types of files (rectangular, time series and hierarchical), and level of retrieval (file(s), records, and variables). The methodological aspect involves abstracting, classification systems (keyword listings, dictionaries and thesauri), conceptual orientations, and standards. Some would argue that the problems involving the environment, storage medium, rectangular and time series files, and classification systems have already been solved. Others would argue that these problems have not been solved satisfactorily. In addition to these problems, we face difficult problems with retrieving hierarchical files and creating a system flexible enough to retrieve files, records, and variables. While we have, for example, developed classification systems, there is no agreement on the content of the keyword lists, dictionaries, and thesauri. Further, dictionaries and thesauri must incorporate the specialized research vocabularies of each discipline or area studies (e.g., European or Latin American studies), but the increasing specialization within disciplines has complicated the development of these retrieval mechanisms. There is also limited agreement on the standards to be set for the contents of the classification systems.
Strenuous efforts to develop retrieval systems have been made over the last decade, but the complex problem of retrieval remains unsolved. Let me give two examples of data retrieval methods presently being employed and some of the difficulties encountered with these methods.
Both the Central Archive for Empirical Social Research at Cologne, and the Roper Public Opinion Research Center at Williams College have attempted to improve retrieval procedures by creating classification and indexing systems for individual questions contained in surveys. Users must locate categories or index words which they think cover their problem; then they must perform manual searches on a large field of questions. In the case of Cologne, a group of sociologists, political scientists, and economists proposed a number of master categories (government, economy, social groups, states and countries). Sub-categories were added and others eliminated. A continuous adjustment of the classification system occurred. The master categories are now rather systematic, but the sub-categories have turned out to be small dictionaries related to different fields of current research. The system will probably have around 500 categories when completed. On the other hand, the Roper Center system has 72 master index words, and 1,378 sub-categories. Although the Roper system would seem on the surface to be more easily manipulatable because the number of major categories is smaller, the master categories are in reality so broad as to make the search for particular questions even more difficult.
For the last several years the DPLS staff has been creating one-page data abstracts for each data file accessed by the staff. The data abstract describes the data file content in a mix of concepts (e.g., social stratification, social and occupational mobility, political efficacy) and key words (e.g., occupation, race, marital status, age). The abstract also contains descriptions of the source of the data, when the data were collected, universe sampled, type of sample, number of data units, size of the study, storage form, reference materials, publications relating to the data, condition of the data, and restrictions on dissemination. The abstract has been used as a preliminary source of information before turning to the codebook itself. It has reduced the time needed to search the contents of a data file. Nevertheless, search time could probably be further reduced if these abstracts were computer-retrievable.
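To make the idea of a computer-retrievable abstract concrete, the following is a minimal modern sketch in Python. It is not the DPLS or FAMULUS implementation: the record layout, the field names, the `search` function and the two sample studies are all hypothetical, chosen only to mirror the abstract contents listed above (concepts, key words, source, sample, storage form, restrictions).

```python
from dataclasses import dataclass, field


@dataclass
class DataAbstract:
    """One-page abstract for a data file; all field names are illustrative."""
    title: str
    concepts: set[str] = field(default_factory=set)   # e.g. "political efficacy"
    keywords: set[str] = field(default_factory=set)   # e.g. "occupation", "age"
    source: str = ""
    collected: str = ""
    universe: str = ""
    sample_type: str = ""
    data_units: int = 0
    storage_form: str = ""
    restricted: bool = False


def search(abstracts, terms):
    """Return the abstracts whose concepts or key words contain every term."""
    wanted = {t.lower() for t in terms}
    hits = []
    for a in abstracts:
        descriptors = {d.lower() for d in a.concepts | a.keywords}
        if wanted <= descriptors:  # every requested term must be present
            hits.append(a)
    return hits


# Two invented studies, used only to exercise the search.
ABSTRACTS = [
    DataAbstract(
        title="Hypothetical Mobility Survey",
        concepts={"social stratification", "occupational mobility"},
        keywords={"occupation", "age"},
    ),
    DataAbstract(
        title="Hypothetical Voting Study",
        concepts={"political efficacy"},
        keywords={"age", "race", "marital status"},
    ),
]

if __name__ == "__main__":
    for hit in search(ABSTRACTS, ["age", "race"]):
        print(hit.title)
```

A search of this kind only narrows the field to candidate studies; the codebook must still be consulted to find the precise variables.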
We intend to input these abstracts into the FAMULUS information retrieval system. A dictionary will be created from the abstract description of the contents of the data files. Since we are a disseminating repository for only about 10 percent of the 500 data studies we hold, we will use these 60-odd studies as a test for reliability, efficiency and cost-effectiveness of the retrieval system. Querying these abstracts will not produce for the user the precise variables for which he is searching, but it is estimated that the computer should reduce search time by about 40 percent.
Nevertheless, the retrieval system which we intend to use will still not resolve the following problems: A codebook and related documentation for the studies must be requested. The precise variables must be located. Variables and records may need to be extracted from the files. The output (a series of keywords and concepts) will not be a standardized set of descriptors employed by archivists, but rather a set of descriptors in use at DPLS. Abstracts will be too large to be inserted in a library's general catalogue as one method of accessing DPLS's data holdings. Since the system is designed for in-house querying on the UNIVAC 1110, neither it nor its contents are easily exportable, although the system is currently operational on the Control Data Corporation 6400 and 6600, and IBM 360, Model 40 or larger systems. The query could prove expensive, so that DPLS could not afford to keep the system "up and running" for more than the trial period. If the file were not exported and an outside user wished to query the system, access to a teletype and an account with our computing center would be necessary. The outside user would then face the same problems encountered by an individual here (as described above) in addition to possible restrictions on the data file's dissemination, and use of the data in the medium in which they are stored.
III. QUALITY OF THE DATA COLLECTION
There are few collections of data which arrive at the archive in anything approaching ideal form. The extent to which the data have been examined and corrected for coding and keypunching errors, and are free from error, was dictated by the special research requirements of the study. The data which come into an archive may have been highly appropriate for a particular analysis but not necessarily useful for immediate and general distribution. Most of the data were, at one time or another, the private domain of an investigator, research group or commercial agency. Rarely were advance preparations made to process the data collection in a form suitable for secondary analysis.
Research needs have usually specified the preserving of only a small amount of the total information originally collected. Rarely are the original protocols accessible to an archive so that archives are in a position to undertake a second content analysis operation. A second content analysis operation could make the study more empirical in coding. Experience has shown that coding all the responses is an asset to the researcher doing analysis. Because coding schemes are often not precise and detailed, data archivists face the complaint by their clients that the variables are precisely what they need, but that the coding schemes are inadequate for their research needs. Archivists do not advocate the availability of the original protocols of a survey because confidentiality of the respondent's responses is essential. But inaccessibility of these protocols makes it almost impossible to verify the data. This means that analysts of secondary data must rely on known parameters for certain variables and must determine whether the values fall within these parameters.
Data archivists are often unable to ascertain the degree of consistency between and among the variables of a data collection. Consistency checking programs are vital in order to determine the accuracy of the data collection. But the development of such programs is a difficult task. These programs must be written in a simple user-oriented language to allow the user to specify the variables, codes, and values which need to be checked. Few consistency checking programs (which include validity checking) have been developed which meet the needs of the researcher. Efficient processes are needed to set up the computer instructions to perform the actual consistency checks; yet, almost all of the programs are costly and inefficient. Because of the nature of social science data collections with their enormous number of variables, the costs become exceedingly high when verification of the data is attempted. Consistency checking involves a detailed understanding of the data, which few individuals, save the original investigator, have. As a result, the archive and its clients rely on the integrity of the data supplier and must perform consistency checking on the few variables needed for data analysis.
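As an illustration of what a user-specified check might look like, here is a minimal sketch in Python. It is an assumption-laden example, not a description of any program used at DPLS: the variable names, code values, the single cross-variable rule and the three sample records are invented. The user supplies (a) the valid codes for each variable (validity checking) and (b) rules relating two or more variables (consistency checking).

```python
def check_records(records, valid_codes, rules):
    """Return a list of (record_index, message) for every failed check."""
    problems = []
    for i, rec in enumerate(records):
        # Validity: each listed variable must carry one of its allowed codes.
        for var, allowed in valid_codes.items():
            if rec.get(var) not in allowed:
                problems.append((i, f"{var}: invalid code {rec.get(var)!r}"))
        # Consistency: each rule is a predicate that must hold for the record.
        for description, predicate in rules:
            if not predicate(rec):
                problems.append((i, description))
    return problems


# Hypothetical codebook: allowed codes per variable.
VALID_CODES = {
    "sex": {1, 2},
    "marital": {1, 2, 3, 4, 9},   # 1 = never married (assumed coding)
    "age": set(range(18, 100)),
}

# Hypothetical cross-variable rule: spouse_occ == 0 means "not applicable".
RULES = [
    (
        "never-married respondent reports a spouse's occupation",
        lambda r: not (r.get("marital") == 1 and r.get("spouse_occ", 0) != 0),
    ),
]

RECORDS = [
    {"sex": 1, "marital": 2, "age": 34, "spouse_occ": 5},
    {"sex": 3, "marital": 1, "age": 41, "spouse_occ": 0},  # invalid sex code
    {"sex": 2, "marital": 1, "age": 29, "spouse_occ": 7},  # breaks the rule
]

if __name__ == "__main__":
    for index, message in check_records(RECORDS, VALID_CODES, RULES):
        print(f"record {index}: {message}")
```

The cost problem described above is visible even in this sketch: every rule must be written by someone who understands the coding of the study, and the number of possible rules grows quickly with the number of variables.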
The most information that an archive can obtain about the study is a description of the type of sampling, sampling error, response rate, interviewer instructions, coding policies and instructions for those data which are survey data, and sources of published statistics for those data which have been converted from published to machine-readable form. The archive staff, with these sorts of information, is able to make some judgement about the condition of the data, but it is in no position to be much more than neutral in deciding whether a data set is usable for secondary analysis. In one sense, the archive's responsibility ends upon distribution of the data set. The condition of a data set is the responsibility of the collection agency and the use of the data set is the responsibility of the analyst.
Archiving the data collection once the project has been completed is usually an "after-thought" on the part of the researchers. The "after-thought" sometimes occurs when office space is no longer available for other projects or when the researcher's published articles appear and requests are made for re-analysis of the data. Researchers rarely consider the role that an archive can play in processing and documenting the data files.3 Part of the problem lies in the nature of processing and documenting a data file and in the perceived importance of the role of archives.
In many cases, the researcher did not adequately prepare the documentation, and countless hours are spent reconstructing the study design, sources and coding schemes. Sometimes, the researcher will decide that the hours spent in this effort just do not warrant the file's dissemination, and the data then become unavailable to other researchers. A related problem also plays an important role in a researcher's decision to make the study available: no funds are available to document the study. Although the study was probably funded by a research committee or federal agency, no funds were allocated to process and document the data once the file was created. If funds were initially requested for processing the data, the agency most likely refused to fund that portion of the project. In general, however, neither the researcher nor the funding agency has accepted the norm that funds for processing the data be built into a research budget proposal.
Neither the researcher nor the federal funding agency has supported the view that federal funds should be used to provide for the pooling of limited resources and the archiving and maintenance of a data collection, nor has it meant that archives are funded to create data collections. Funding agencies continue to support research projects which require the use of similar data items. Each researcher must begin anew to acquire data. This is an enormous investment of a researcher's energy. Rather than investing his energies in developing the theoretical and methodological guidelines of his analysis, the researcher must devote the initial months of the project to gathering data. Not only are resources expended in the gathering of similar data, but individual researchers must then undertake to process and document the data. Given the idiosyncrasies of each researcher and oftentimes the lack of knowledge necessary for processing and documenting the data, the items are processed differently and the documentation is not standardized. Archives which have gained considerable experience in processing and documenting data files efficiently, and are probably in the best position to provide advice on creating data collections, are almost always refused a request for funds to create data files. Federal funding agencies perceive their "mission" as support for research and not data collection. This view of how social science research is conducted indicates that there cannot be a real understanding of the nature of empirical enquiry on the part of public officials. This is also a short-sighted view of an agency's mission, because as has been pointed out above, the agency will continue to lengthen the time involved in testing hypotheses and will expend precious resources unnecessarily. While it may not be of much importance that time and resources are expended unnecessarily, the view that no funds are allocated to an archive for the processing, documenting, and maintenance of a data collection has meant that the quality of data collection has suffered.
IV. RESTRICTIONS ON DISSEMINATION OF DATA
Because data are the responsibility of the data supplier, the data may not be released for general use. There is apprehension that the data will be used improperly or that analysis will lead to contradictions in published findings, or that the cleaning process may cause embarrassment. Too, the data supplier places restrictions on access to the collection of data because analysis of the data is still taking place and the data supplier should be given the first opportunity for publication based upon the analysis of the data. While a number of people may view restrictions on access to a data collection as a difficult obstacle to overcome, we, at DPLS, have had few problems. Restrictions on access are respected by the archive staff. Each collection which is disseminated by DPLS is accompanied by the data supplier's signed statement specifying the conditions under which the data may be disseminated. The statement forms part of the permanent record on the data file. Every time a user wishes to have a data file duplicated, its classification status (i.e., restricted or unrestricted) is verified. Restricted access files require written permission of the data supplier before the archive staff will process a user's request. This, of course, lengthens the time between locating a data set suitable for one's analysis and preparing it for analysis; however, if data archives are to fulfill their major functions of obtaining and preserving data for secondary analysis, their staffs must respect the restrictions placed upon the data deposited by a data supplier. It should be added, however, that pressure needs to be applied to those researchers who refuse to make their data publicly available two to three years after completion of the project. Archivists generally believe that a researcher who refuses public access to a data collection does so because the data will not stand up to public enquiry and his results may be questioned if the data are re-analyzed.
V. EXPORTABILITY OF THE STORAGE MEDIUM
A major problem faced by DPLS is the exporting and importing of the data storage medium in a form easily usable in one's computing environment. Hollerith cards are not universal: cards are punched in EBCDIC, ASCII, BCD, etc., codes. They are a poor medium of storage, as they are bulky, easily affected by atmospheric conditions, and costly to transport. Further, as data files become more massive, it is impossible to manipulate card-images. As a result, we export and obtain our data on magnetic tape. Character codes, blocking, number of channels, density, and physical record size are of immense importance to us. In order for a data set to be usable immediately on our machine, a Univac 1110, we must obtain data in BCD or FIELDATA codes, seven channel, 556 or 800 BPI format. For use on the Univac 1110, blocking size is not of great importance. But, because we also have access to an IBM 1410 which inexpensively duplicates data, we are limited to a BCD, seven channel, even/odd-parity, 556 BPI format, with a block size of no more than 6000 characters. It becomes immediately evident that unless an outside user has access to a machine with a seven channel tape drive, data cannot be obtained inexpensively from us; nor can we easily obtain data from elsewhere if the data cannot be formatted to our specifications.5 (The following is an example of what DPLS faces every time it wishes to add a non-BCD data tape to its collection: A major recent acquisition to DPLS has been the statistics of the International Monetary Fund, 1948 to 1973. This acquisition arrived in Burroughs BCL code, necessitating many hours of conversion before the data could be used on our UNIVAC 1108. Conversion is so costly that DPLS is unable to convert the data which arrive monthly, but must make the conversion program available to the user, who, in turn, must invest precious resources to obtain the necessary data. Yet, the acquisition is deemed of importance to scholars at the University, and will have wide-spread use within the academic community.)
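The conversion burden described above can be illustrated with a minimal modern sketch in Python. It assumes fixed-length records blocked together on the tape and uses Python's built-in `cp037` codec, which covers one common EBCDIC variant. The function names `unblock` and `transcode` and the sample records are hypothetical; codes such as BCD, FIELDATA, or Burroughs BCL have no built-in codec and would require a hand-built translation table.

```python
# Hypothetical sketch: split a physical block into fixed-length records and
# decode each record from its source character code into text.

def unblock(block, record_length):
    """Split one physical block into its fixed-length logical records."""
    if len(block) % record_length != 0:
        raise ValueError("block size is not a multiple of the record length")
    return [block[i:i + record_length]
            for i in range(0, len(block), record_length)]


def transcode(record, source_code="cp037"):
    """Decode one record from the source character code (assumed EBCDIC)."""
    return record.decode(source_code)


# Two invented 8-character records, written in EBCDIC and blocked together.
block = "ARG 1947BRA 1960".encode("cp037")
records = [transcode(r) for r in unblock(block, record_length=8)]
print(records)
```

The check on block size reflects the point made in the text: unless the record length and blocking are known in advance, the tape cannot be read correctly at all.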
One reason why exportability of the storage medium is a problem in the accessing of data is that there has been little agreement about the form in which data are to be transported. It has meant that the major data suppliers, like the Inter-University Consortium for Political Research (ICPR), must supply data in different forms in order to meet the needs of an archive's clientele. That is to say, ICPR's computer must be able to write tapes in BCD and EBCDIC, at different densities, for schools which have the SPSS or OSIRIS system or no file handling system at all, in physical records of different character sizes, and to provide descriptions of the location of the variables according to the differing formats of the data. The incompatibility between and among the differing hardware configurations and different computer environments has meant, too, that data are not "moving" as easily as they should be among researchers based at different computer installations.
Probably the greatest difficulty in exporting the data is the poor documentation of the data description. By this we mean that we are often faced with not knowing on what machine the data were duplicated, the character code, whether the tape contains internal labels, and if those labels have end-of-file marks following them, whether record lengths are fixed or variable (size of the record and whether the record is rectangular or hierarchical), density, etc. Actually, this documentation is quite trivial in that the problem could be solved by standardization of the description of data tape specifications.
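One way to picture such a standardized description is as a fixed checklist that must accompany every tape. The following Python sketch is illustrative only: the `TapeSpec` field names and the `missing_fields` helper are assumptions drawn from the items listed above, not an existing standard.

```python
from dataclasses import dataclass, fields


@dataclass
class TapeSpec:
    """Hypothetical checklist of what a tape description should state."""
    source_machine: str      # machine on which the tape was written
    character_code: str      # e.g. "BCD", "EBCDIC", "FIELDATA"
    channels: int            # 7 or 9
    density_bpi: int         # e.g. 556 or 800
    parity: str              # "even" or "odd"
    internal_labels: bool    # does the tape carry internal labels?
    eof_after_labels: bool   # do end-of-file marks follow the labels?
    record_format: str       # "fixed" or "variable"
    record_length: int       # characters per logical record
    block_size: int          # characters per physical block
    file_structure: str      # "rectangular" or "hierarchical"


def missing_fields(description):
    """Return the checklist items a supplier's description leaves out."""
    return [f.name for f in fields(TapeSpec) if f.name not in description]


# An invented, incomplete description of the kind an archive might receive.
received = {"character_code": "BCD", "density_bpi": 556}
print(missing_fields(received))
```

Run against an incoming description, a check of this kind would show at once which specifications the archive must still request from the supplier.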
We need to develop an efficient data access system which will allow the scholar to pursue and renew research. If the scholar can quickly locate pertinent data wherever they are stored, if the indexing of studies and items is efficient, and if the retrieval and analysis methods are rapid and readily adaptable to scholars' needs, large data files will provide the analyst with enough studies to confirm or reject a hypothesized set of relations among several variables. If enough replication is possible, the conditions under which the propositions are true can be specified. Rapid access will produce a breakthrough in the development of axiomatic theories by presenting to social theorists large quantities of data which can be analyzed according to their own specifications.
The preceding discussion has emphasized problems of access to data. However, the same sorts of problems exist for computer programs. Because DPLS is a center for machine-readable data and computer programs of the social sciences, we are continually faced with providing our users with access to statistical routines for analysis of their data. We have found that we cannot resolve data access problems without first considering the development of related software and hardware and the interfacing of the data with the program. The development of an efficient data access system requires substantial funds for the development of efficient software. Because substantial funds are not available, we continue to meet immediate needs via the tedious manual search of the literature and data documentation, correspondence, and informal communications networks among archivists and social scientists.
This discussion has painted a gloomy picture of data access today. We should conclude by offering several recommendations which could facilitate access to data.
We need a national association of social science data archives. This association would set as one objective the inventorying of available machine-readable social science data collections. It would publish this inventory yearly and make it available to its members, who would not be individuals representing only themselves, but archives responsible for preserving and maintaining data files for their institutions. The association would be a centralized facility for providing information about data files. S S Data: A Newsletter of Social Science Archival Acquisition could act as the reporting journal for such information. S S Data, whose staff now depends on the good will of the University of Iowa and a very meager budget based on a subscription fee of $5.00, would receive an enlarged budget through an increased subscription fee from contributing archives. The subscription fee could be based on the size of the local archive's budget and the size of the institution to which the archive belongs. An increased subscription fee would mean an enlarged S S Data staff which could then allocate a staff member to locate data files for the association's members. Although some members of the defunct Council of Social Science Data Archives recommended that information on the availability of data be transmitted via a telecommunications system, such a grandiose scheme is not necessary to the success of locating data. The telephone and mail systems operate well enough, and there is enough good will generated by archive staffs, that we do not need to envision such a method of transmitting information. Given EDUCOM's ARPA network recommendations and the increasing number of universities and private and public agencies which are joining this network, we might foresee a future "hook-up" via teletype.
At present, each archive has very little information about the successes (and failures) of another archive. Few archives know how another operates or is administered. This association would give all archives an opportunity to seek assistance from others. Fledgling archives would have an opportunity to seek assistance from on-going archives. Through an association, structured workshops could be established to work on the problems of retrieval systems, integrating the various file handling systems which have proliferated throughout the country, and standardizing a classification system, abstracts and bibliographic records, data collection processing and documentation of data and software.
As was stated earlier, there has been no acceptance of the norm that funds for processing the data be built into a research budget proposal. There has existed the parochial view that the data belong to the researcher who collected them, even though the support funds have come from a university research council or the federal government. We must make a real effort to educate the people who are applying for research grants which involve the collection of data. We must make a real effort to educate the funding agencies who support the research. Researchers and funding agencies must realize that the data need to be more widely disseminated. An association of archives could, perhaps, make the difference in whether archives will be properly funded by universities and the federal government. An association could more effectively work to initiate changes in the quality of data collections. An association which presented publicly the archive's "view of the world," so to speak, might be a more effective lobbying group for federally supported data gathering projects than is the individual archive which requests funds. Finally, because so much data have been collected through federal research support, federal monies should be used to support not only the individual researcher who collects the data, but the archive which preserves, maintains, and disseminates them.
This discussion of data access problems has been based on the experience of the Data and Program Library Service at the University of Wisconsin. Although no reference has been made to problems facing Latin Americanists, there is no doubt that all archives, no matter where they are located, face precisely the same problems. The problems differ only by degree.
1. See John D. Byrum, Jr., and Judith S. Rowe, "An Integrated, User-Oriented System for the Documentation and Control of Machine-Readable Data Files", Library Resources and Technical Services, Volume 16, No. 3, Summer 1972, pp. 338-346.
2. This does not deny the existence of the growing number of data management systems such as SPSS (University of Chicago), OSIRIS (University of Michigan), IMPRESS (Dartmouth College), ADMINS (MIT), DATATEST (Harvard), BEAST (Brookings Institution), TROLL (MIT), MASSAGER (Canada), PSTAT (Pittsburgh), TDMS (SDC), and STATJOB (University of Wisconsin).
3. Researchers who do make their data available to an archive have almost always used an archive, and have thus been able to view firsthand the problems of working with inadequately processed and documented data files. These researchers are largely the younger generation of social scientists.
4. One reason we have not encountered difficulties in obtaining data on Latin America is that we have depended on the American-based archives specializing in this area (e.g., Latin American Data Bank, University of Florida) to acquire data. However, it should be stated that data from Latin American countries can be more difficult to obtain because the data are not permitted to be disseminated outside a nation. Some Latin American countries consider data files to be part of the national patrimony, and therefore the data cannot be accessed by an individual who is not a citizen of that nation.
5. I do not want to leave the impression that we do not provide or obtain data in forms other than those easily utilized on our machine. We do have access to machines with nine-channel tape drives. The cost, however, of these machines increases the cost of duplication of data for a requestor from outside the University of Wisconsin system, and increases our costs when we obtain data from elsewhere.
LATIN AMERICAN HOLDINGS OF THE UNIVERSITY OF WISCONSIN DATA AND PROGRAM LIBRARY SERVICE *
The data holdings of the Data and Program Library Service at the University of Wisconsin are organized into two sections. The first section is the PRAGMATIC INDEX. Data sets are organized into twenty-one categories which are neither exhaustive nor mutually exclusive because the contents of most data sets are varied. Although many data sets logically fit into several categories, each data set has been assigned a unique number which serves to organize the library's documentation for the data set.
The UNIT-OF-ANALYSIS INDEX is designed to locate data for particular countries or areas. It is organized on the basis of two units of analysis (observations) concerning humans: aggregate and individual. The aggregate units of analysis are geographic (e.g., nations, states, congressional districts, counties, cities, metropolitan areas, census tracts) and non-geographic entities (e.g., hospitals, corporations). Within categories relevant data sets are listed alphabetically by country of origin, its subdivisions (when appropriate), a descriptive term for the principal type of data contained, the date when the data set was first generated, and the identification number assigned in the PRAGMATIC INDEX. Data sets marked with an asterisk (*) are held uniquely by the Data and Program Library Service.
The ACKNOWLEDGEMENTS note the donors of unique data sets. These data may be distributed to researchers outside the University of Wisconsin, Madison campus. Data sets not marked with an asterisk were obtained from other data archives (notably, the Roper Public Opinion Research Center at Williams College, the Inter-University Consortium for Political Research at the University of Michigan, the International Data Library and Reference Center at the University of California, and the Louis Harris Political Data Center at the University of North Carolina). These data sets may be duplicated only for the University of Wisconsin, Madison campus users. Unless a data set is restricted, it is available for immediate distribution either in card or tape form.
For a complete listing of data holdings and further information the reader is referred to the Data and Program Library Service; Social Science Building-Room 4451; University of Wisconsin; Madison, Wisconsin 53706.
* This partial list was compiled from the Wisconsin list of data holdings by the editor.
OUTLINE AND ORGANIZATION OF DATA HOLDINGS
I. PRAGMATIC INDEX
01A Survey Election Studies
01B Aggregate Election Studies
02 Attitudes Toward the Political System
05 Roll Calls
06 Black-White Relations
08 Community Studies
09 Attitudes Toward the Social System (Self and Society)
11 Occupational, Geographic and Social Mobility
14A Econometric Models
14B Economic Surveys
14C Economic Time Series
18 Surveys Conducted by Commercial Polling Agencies
II. UNITS-OF-ANALYSIS INDEX
A. Human Aggregates
2. Sub-national Units
3. Non-geographic entities
B. Human Individuals
1. Objective Events
2. Attitudes, Values, Beliefs-Surveys
III. ACKNOWLEDGEMENTS
I. PRAGMATIC INDEX
01-ELECTION STUDIES (A. SURVEY; B. AGGREGATE)
01-025A POLITICAL BEHAVIOR IN CHILE, OCTOBER 1958
01-008B *ARGENTINE ELECTION OF 1946 AND CENSUS OF 1947 (RAW
02-ATTITUDES TOWARD THE POLITICAL SYSTEM
02-012 LEADER AND VANGUARD IN ARGENTINE SOCIETY (KIRKPATRICK SURVEY), 1966
04-009 WORLD HANDBOOK OF POLITICAL AND SOCIAL INDICATORS,
04-010 AGGREGATE DATA FOR 20 LATIN AMERICAN COUNTRIES,
08-006 DIFFUSION OF INNOVATIONS: BRAZIL, 1968, COMMUNITY
09-ATTITUDES TOWARD THE SOCIAL SYSTEM (SELF AND SOCIETY)
09-013 PATTERN OF HUMAN CONCERN: CUBA, 1960 (REFORMATTED
VERSION FROM ICPR)
09-014 PATTERN OF HUMAN CONCERN: BRAZIL, 1960-1961 (REFORMATTED
VERSION FROM ICPR)
09-015 PATTERN OF HUMAN CONCERN: PANAMA, 1962 (REFORMATTED
VERSION FROM ICPR)
09-018 PATTERN OF HUMAN CONCERN: DOMINICAN REPUBLIC, 1960
(REFORMATTED VERSION FROM ICPR)
09-020 SURVEY OF LIFE STYLES AND ASPIRATIONS IN COSTA RICA
AND EL SALVADOR, 1960, 1963
10-004 SURVEY OF FERTILITY PATTERNS AMONG MIDDLE CLASS WIVES
IN COSTA RICA, 1964
11-OCCUPATIONAL, GEOGRAPHIC AND SOCIAL MOBILITY
11-006 URBANIZATION IN SIX BRAZILIAN CITIES, 1960 (IDL-RS
*Data sets marked with an asterisk (*) are held uniquely by DPLS.
11-008 CAREER VALUES IN BRAZIL, 1963 (IDL-RS 302-40-0002)
11-009 CAREER VALUES IN MEXICO, 1963 (IDL-RS 110-40-0002)
11-010 *SOCIAL STRATIFICATION IN FOUR RURAL COMMUNITIES, BRAZIL, 1953 AND 1962
STANDARD OF LIVING IN FIVE CENTRAL PROVINCES, CHILE, 1964 (IDL-RS 304-71-0001)
*MIGRATION AND ADAPTATION IN CENTRAL BRAZIL, 1966
STRATIFICATION AND MOBILITY IN FOUR LATIN AMERICAN CITIES: BUENOS AIRES, ARGENTINA, 1960
STRATIFICATION AND MOBILITY IN FOUR LATIN AMERICAN CITIES: RIO DE JANEIRO, BRAZIL, 1969
STRATIFICATION AND MOBILITY IN FOUR LATIN AMERICAN CITIES: SANTIAGO, CHILE, 1961
*ATTITUDES TOWARD LEARNING SPANISH AMONG QUECHUA AND AYMARA CHILDREN
*ATTITUDES TOWARD LEARNING ENGLISH AMONG PUERTO RICAN NINTH-GRADE SCHOOL CHILDREN
15-003 *SLAVE SHIP RECORDS, GREAT BRITAIN, 1817-1843
15-004 *SLAVE SHIPS OF EIGHTEENTH CENTURY FRANCE "...REC0E1
15-005 *SLAVE SHIPS TO RIO DE JANEIRO, RECORDS OF
18-SURVEYS CONDUCTED BY COMMERCIAL POLLING AGENCIES
M. URUGUAYAN INSTITUTE OF PUBLIC OPINION
20 Latin American Nations; variables of general utility, 1950-1965
Argentina; slave ship arrivals at Rio de Janeiro,
Canada; national economic time series, 1967
France; slave ship activities, 1700-1800 15-004
Great Britain; slave ship activities, 1817-1843 15-003
HUMAN AGGREGATES-Sub-national Units
Study Description Identification
Argentina; all provinces, election data, 1947, census data, 1947 01-008B
Mexico; all municipios (counties), census data, 1950, 1960 07-016
HUMAN INDIVIDUALS-Attitudes, Values, Beliefs-Surveys
Argentina; sample of medical students, politics and career expectations, 1961 17-005
Argentina; sample of Buenos Aires, stratification and mobility, 1960 11-016
Argentina; elites and electorates, political attitudes, 1966 02-012
Brazil; urban and rural households, migration,
Brazil; urban workers, career values, 1960 11-008
Brazil; national sample, aspirations and worries,
Brazil; sample from 6 cities, migration, 1959 11-006
Brazil; sample of households in 4 rural communities, social stratification, 1953 and 1962 11-010
Brazil; sample of Rio de Janeiro, stratification and mobility, 1959 11-016
Brazil; local officials and informal leaders in a sample of villages, local changes, 1966-1968 08-006
Brazil; farmers and agricultural advisors, innovations, 1966-1968 08-006
Chile; sample of Santiago, stratification and mobility, 1961 11-017
Chile; sample of 5 central provinces, standard of living, 1964 11-011
Chile; national sample, politics, 1958 01-025A
Costa Rica; local sample of middle class wives, fertility 10-004
Costa Rica; repeated urban sample, life styles, 1960 09-020
Cuba; national sample, aspirations and worries,
El Salvador; urban sample, life styles, 1963 09-020
Mexico; urban workers, career values, 1963 11-009
Mexico, U.S., Great Britain, Germany, Italy; national sample, politics, 1959-1960 02-006
Panama; national sample, aspirations and worries, 1962 09-015
Peru; sample of Aymara and Quechua children, Spanish language training c. 1965 13-018
Uruguay; national sample, amalgam, 1966 18-001M
01-008B Collected and donated by Professor Peter Smith,
History Department, University of Wisconsin. ACCESS RESTRICTED
13-018, 13-019 and 13-020 Conducted and donated by Professor Erwin Epstein, Sociology Department, Kearney State College. ACCESS RESTRICTED
15-003 and 15-004 Collected and donated by Professor Philip Curtin, History Department, University of Wisconsin
15-005 Collected and donated by Professor Herbert S. Klein, History Department, Columbia University, under arrangements made by Professor Philip Curtin, History Department, University of Wisconsin
THE ICPR AS A RESOURCE FOR SOCIAL RESEARCH: DATA DEVELOPMENTS RELEVANT TO LATIN AMERICAN STUDIES
Richard I. Hofferbert, Executive Director Inter-university Consortium for Political Research The University of Michigan
ARCHIVES OF SOCIO-POLITICAL DATA ARE NOW BEGINNING TO FULFILL the early projections of their potential as aids to social research. The major archives (the Roper Center for Public Opinion Research, the Inter-university Consortium for Political Research, the International Data Library and Reference Service, and the Latin American Data Bank) stand as illustrations of the brighter side of the tempestuous marriage between technology and scholarship. Several factors have converged in the last decade to fulfill the promise upon which these institutions were founded. Computing capacities and the intermediate personnel to make these resources available on most campuses have been of considerable consequence.
The raw technology, however, can be over-valued in the generally salutary developments which have taken place. More important have been changes in the skills and habits of scholars. Graduate curricular changes in the last decade have significantly expanded the pool of personnel skilled in quantitative analysis. And, most importantly, the ethic which prescribes the sharing of data resources among scholars has led to the compilation of a readily available body of resources exceeding in its substantive and qualitative content the brightest of the early predictions. While there may be no grounds for an outpouring of scholarly glee, methodological and theoretical advances are rapidly being made in comparative social research. These advances are coming in an iterative and interdependent fashion with the growth in skilled personnel and in information resources, as represented by the contents of the various archives.
The success of the archives as aids to comparative research is closely related to the growing number of indigenous scholars within several national settings. As long as an asymmetrical relationship exists between American scholars and their colleagues abroad, neither the quality nor the quantity of comparative information (as represented, in this case, by the content of data archives) can be expected to attain the level that previously existed in American studies alone. However, today in virtually all countries where American scholars have focused their attention we find the growth of groups of local scholars trained in modern social inquiry, supported by local institutional capacities, and engaged in cumulative, substantively sensitive, and theoretically interesting research. Furthermore, following the example early established in the American context, these scholars are increasingly willing to share the results of their work, both in print and as data, with scholars throughout the world. The existence of the archives throughout the 60's as service agencies which could aid in the multiplication of research results from the data collected by scholars in the field has changed both in degree and kind the potential for comparative inquiry. The examples of the North American archives are being followed in a variety of national contexts.
Although no multi-institutional, indigenous archival facility in Latin America has yet attained international centrality, there are numerous points of research activity within the Latin American context which promise to develop close relationships with existing archival institutions and also to grow steadily in terms of their own national and regional capacities. Many of the data sets produced by students of Latin American affairs are now being routinely made available to the international scholarly community through the existing archival facilities. Appended to this paper is a full list of data in ICPR that are most likely to be useful for students of Latin American socio-political phenomena.
I want to focus this paper generally on newer modes of research with high theoretical potential, resting upon the utilization of archival resources currently available. Because of my greater familiarity with the resources of the ICPR, I will concentrate upon those data. Specifically I want to highlight the possibilities for utilizing and integrating four types of data that until recent years were not readily available in compatible form. The first category of data is a growing body of surveys of political attitudes and behavior with comparable content from a wide range of national settings and time points. The second set of materials are nation-level data describing rates of social change, economic activity, political structures, and indicators of public policy. The third category, which I shall show may well be integrated into the previous two, consists of data on sub-national units from multiple national settings and time points. Finally, I will take brief notice of certain data which concern extra-national phenomena and which make possible analysis of a variety of multi-national activities.
"SECONDARY" VS. "EXTENDED" ANALYSIS OF SURVEY DATA
In the realm of survey materials there is a growing number of studies in the ICPR archive which share a common theoretical focus and which contain a considerable comparability of content. Mass surveys of electoral behavior are now in the archives in quite usable condition from Australia, France, Ireland, Britain, Norway, Switzerland, Japan, Canada, Italy, and Germany. In coming months, additional countries will be added including at least a few major Latin American surveys. I list these examples because of the comparability of content. As the appended notes make clear, there are a variety of surveys already available in the archives which concern Latin American phenomena.
The political studies, however, all build upon the general focus and thrust of political behavioral inquiry begun in the United States in the 1950's but significantly modified in recent years to accommodate specific national contexts and theoretical advances. Why, one might ask, should non-Latin American studies be cited for a group of Latin American scholars? Consistent with the logic that I hope to put forth in the following paragraphs, these studies should provide comparative base points for one pursuing research in additional national contexts. A set of surveys being conducted at the present time in Brazil are relying heavily upon the designs, assumptions, and content of the studies mentioned above.
One semantic problem should be set aside very early in this discussion. Customarily, the term "secondary analysis" is used to describe the utilization of data from archival sources. Given the increasing complexity of the tasks to which these data resources are put, I prefer a different term. Although Herbert Hyman might disagree, "secondary analysis" has a certain ring to it implying the warming over of used data. The image of the undergraduate honors student leafing through the cards from the American Voter to sub-divide the categories yet one more way comes first to mind. I have no readily acceptable substitute for "secondary analysis," unless it would be "extended analysis". My reason for suggesting a new term is because of the uses to which existing data resources are now being put and to which they are likely to be applied in the future.
The actual and foreseeable theoretical advances rest upon the diversity and multi-level nature of the data resources which are increasingly available. In particular, the possibility for formulating and testing generalizations independent of particular political systems is growing rapidly. Existing data resources are utilized in several ways in these developments. First of all they are used as models for replication. Thus the early voting studies conducted in the United States have been modified, verified, replicated, and improved in other national contexts. In addition to serving as models for replication, existing data resources provide bases for expanding comparison. Thus scholars interested in phenomena in as yet unexamined national contexts not only borrow from the designs of existing research, as manifested in the data, but also themselves provide through their own efforts an ever-expanding base of comparative materials. Finally, the sheer passage of time and the repeated financial support for mass survey research has provided a basis for increasingly interesting longitudinal analyses. In the case of the American surveys, for which we have the largest body of materials, we now can do in-depth studies of across-time changes from 1952 to 1972. The French and British electoral surveys span the past 15 years. The theoretical possibilities are implicit in the number and time span of nationally based studies. Use of data resources as models for replication, as bases for expanding comparisons, and for longitudinal analysis is far richer than customarily implied by "secondary analysis." It illustrates an extension of the theoretical and methodological potential of the resources that may well have gone unperceived by the scholar originally responsible for the first body of materials.
I have repeatedly made reference to theoretical possibilities implicit in the resources being acquired by various data archives. Primarily, the theoretical possibilities lie in the capacity to move from system and time specific correlation of attributes to the comparison of general relationships. For example, what groups, areas, or aggregations comprise the political "periphery" for a multitude of nations? The portions of the population that are in some sense or another peripheral to the main national dimensions of political and social interaction vary in their specific content from one nation to another. The salient minorities in one place may be Catholics, in another a linguistic group, in another those pursuing a particular mode of tillage, and still elsewhere it is language that sets groups to the edge of the socio-political "mainstream." We can, for example, describe the correlates of isolationist attributes in Norway. Perhaps Nynorsk speakers appear as a unique group. But is it not much more theoretically promising to exploit multiple data sources that at least offer the chance to define both the independent variable ("periphery") and the dependent variable ("isolationism") in terms that are not nation-specific? Is there a functional equivalence between serious Catholics in France, Italian-speaking Swiss, American blacks, Mexican Ejidos, and, say, scheduled castes in India? Are the relationships not only system independent, but also longitudinally stable?
Clearly I am suggesting that the collection of multiple time, multiple system surveys encourages continuity and comparability. I am also suggesting, however, that they allow for more elegant developmental models. Theoretically the most productive questions concern not the correlation of attributes in a single setting, but the macroscopic forces at the system level which structure the patterns of relationship between micro level phenomena.
Correlations between social class and political behavior are no more fixed in the stars than is the political relevance of religion or region. That which is "politicized" (i.e., attributes which structure political behavior) is itself the result of processes determined in part by structural conditions peculiar to national contexts. Aggregate phenomena at the nation-level can be viewed either as consequences of internal political dynamics or alternatively as contexts which determine internal political processes. The theoretical question need not be resolved in order for the relevant information to be analyzed imaginatively. The macro data makes possible at least the realistic design of research enterprises aimed at unraveling some of the causal sequences.
From the early, admittedly halting steps to develop widely available nation-level indicators (e.g., the Cross-Polity Survey), one now has available quite sophisticated and much more reliable indicators in machine-readable form. Arthur Banks's Cross-Polity Time Series represents a major improvement in both the theoretical potential and the accuracy of the contents over the Cross-Polity Survey. Data from 1815 to 1966 for 153 national units are included in the more recently compiled set. Add to this the domestic conflict data from Banks's collections or from Ted Robert Gurr and others, and one is able to begin conceiving of designs which have a rich longitudinal as well as multi-level component. Rummel's Dimensionality of Nations data plus now two editions of the World Handbook of Political and Social Indicators all add significantly to the storehouse of easily utilizable materials and should challenge the scholarly imagination to design ever more elegant analytical structures. Whereas one previously was limited either to individual level phenomena or to aggregate indicators, the argument over the relative superiority of the one over the other clearly must take a back seat to the challenge to design research undertakings which base their expectations upon the interaction between contextual and individual phenomena.
An example might well be the question of political involvement by peripheral populations. What are the general macro phenomena that politicize groups on the edge of the national "mainstream"? When and where do class phenomena emerge as more important than ethnicity or language?
No single project (with very few exceptions) has been able to compile the resources for such extended analysis. Yet the generosity of scholars and the institutional capacities of archives have moved us around that problem.
Attention needs to be given not only to the multiple applications of survey resources and utilization of archival facilities for individual level analysis in multiple settings at multiple timepoints; we now need to look also at the possibilities of multi-level analysis. Survey work heretofore has, as is appropriate to the instrument, been microscopic and, perhaps unfortunately, often designed with minimal attention to contextual phenomena such as community or other environmental attributes.
Survey research necessarily suffers from the basic sampling problem. One cannot readily compare jurisdictions within nations on the basis of mass, national sample surveys. The N must necessarily be far too large for the practical possibilities of current research resources. The cumulation of studies in the various archives is generally overcoming some of this limitation.
Whereas in the past, furthermore, when one resorted to aggregate data it was usually because of the impracticality of obtaining adequate survey data, one now confronts the possibility and necessity of considering the context within which individual behavior takes place. Interesting designs are suggested by the possibility of examining multiple contexts. The nation, in many contemporary settings, is perhaps the least salient context within which individual behavior takes place. Perhaps we can never attain a comprehensive, global examination of family life, but we can certainly move, with existing resources, to a comparison of the salience of socioeconomic context at various levels from the community to the nation (and even conceivably the international system).
MULTIPLE TIME AND MULTIPLE SETTING SUB-NATIONAL DATA
Despite the difficulties of doing sub-national survey analyses in a manner that will aggregate to the national level, we have long been alerted to the consequentiality of community attributes, structural constraints, and local leadership phenomena in the development of patterns of political behavior. In the American context this early on took the form of a dispute between users of aggregate data and users of the survey instrument. That it is not a question of which type of data is better should have been clear from the outset. Tingsten's studies of voting behavior in the 1930s revealed what has come to be called the "concentration effect." As the aggregate proportion of working class persons in a community rose, up to a point, there was an increase in the number of working class persons voting for labor parties. The curvilinear relationship would never have been revealed in a mass national survey. The practice of carefully coding sub-national location of respondents in jurisdictions for which aggregate socioeconomic materials are available is growing in popularity and theoretical promise. In their recent book on Participation in America, Verba and Nie carefully examine the effects of community attributes on individual participation. Technically, it is no great problem. One simply enters into each respondent's file the characteristics of the unit within which he resides. Such a procedure will increasingly allow us to examine survey materials with the aid of companion aggregate materials, such as those on some of the countries that are now represented in the ICPR files. In fact, there are currently some surveys which have been specifically designed to incorporate aggregate or "contextual" attributes directly into the files. In one survey currently being analyzed in Switzerland, attributes of the canton have been entered into the individual files.
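The procedure of entering unit characteristics into respondent files can be illustrated with a minimal sketch. It is not drawn from any ICPR study: the unit names, attribute names, and values below are invented for illustration, and the function `attach_context` is a hypothetical helper.

```python
# Hypothetical illustration: attach community-level (aggregate) attributes
# to individual survey records. All names and values are invented.

# Aggregate file: one record per sub-national unit.
units = {
    "canton_a": {"pct_working_class": 42.0, "urban": True},
    "canton_b": {"pct_working_class": 18.5, "urban": False},
}

# Survey file: one record per respondent, coded with the unit of residence.
respondents = [
    {"id": 1, "unit": "canton_a", "voted": True},
    {"id": 2, "unit": "canton_b", "voted": False},
    {"id": 3, "unit": "canton_c", "voted": True},  # no aggregate record on file
]


def attach_context(respondents, units):
    """Return new respondent records with the unit's attributes merged in.

    Respondents whose unit has no aggregate record are kept and flagged
    with context_found=False rather than silently dropped.
    """
    merged = []
    for r in respondents:
        context = units.get(r["unit"])
        row = dict(r)  # copy, so the original survey record is untouched
        row["context_found"] = context is not None
        if context is not None:
            for name, value in context.items():
                row["ctx_" + name] = value  # prefix marks contextual variables
        merged.append(row)
    return merged


merged = attach_context(respondents, units)
for row in merged:
    print(row)
```

Keeping unmatched respondents and flagging them is one design choice among several; it makes visible how much of the sample lies in units for which no aggregate material has been archived.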
Fortunately, we need not duplicate for every study the task of collecting all of the aggregate materials. There is within the ICPR and other archives a growing body of aggregate socioeconomic and political data on significant sub-national units within a variety of national contexts. Currently, thanks to the cooperation of the Berkeley archive, ICPR has the Di Tella data on Argentine counties and provinces from the mid-60's. These data, described in the Appendix, are a rich source of material concerning population composition, literacy, and economic activity within a large number of Argentine units. Similar data are contained in the Schmitter collection on Brazilian states for a time period encompassing 1940-1960. Data which I have collected on Mexican states provide yet another national setting within which socio-political data on sub-national units are available. In the latter case, comparable indicators have been collected from Mexican states, Canadian provinces, French departments, Swiss cantons, and American states.
One should not confine one's interest in aggregate data, however, exclusively to the manner in which they can complement survey materials. Sub-national aggregate analyses clearly have a role to be played in their own right. This is especially becoming clear in the area of comparative policy studies. In many instances, the sub-national units for which data are readily and currently available have considerable responsibility for raising revenue and determining the patterns of public policy within nations. Exciting analyses comparing not merely correlation of attributes within a single country, but the patterns of internal dynamics across sets of nations await us in this domain. Just as the discussion above pointed to the possibility for cross-national comparisons of micro socio-political phenomena, similarly aggregate sub-national data can be analyzed (especially in a policy model) to reveal the dynamics of internal socio-political processes. Is the relationship between urbanization and public policy (as revealed by sub-national correlation analysis) comparable from one national setting to another? Have population growth, industrialization, or other attributes of social change had similar political consequences from one cultural context to another? By examining not merely bivariate correlations but also patterns of association from several settings, one can move toward a reasonably promising attack upon such questions.
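The comparison of sub-national correlations across national settings can be sketched as follows. The figures are invented and the country labels are placeholders; the sketch only shows the mechanics of computing the same bivariate correlation (here, percent urban against per capita public expenditure) within each country's set of units and then setting the coefficients side by side.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Invented data: (percent urban, public expenditure per capita) for each
# sub-national unit, grouped by a placeholder country label.
data = {
    "country_x": [(20, 10), (40, 20), (60, 30), (80, 40)],
    "country_y": [(20, 40), (40, 30), (60, 20), (80, 10)],
}

# One coefficient per national setting, computed over that nation's units only.
correlations = {
    country: pearson([u for u, _ in rows], [s for _, s in rows])
    for country, rows in data.items()
}

for country, r in sorted(correlations.items()):
    print(country, round(r, 3))
```

In this contrived case the relationship runs in opposite directions in the two settings, which is the kind of difference a pooled, single-country, or purely national-level analysis would not reveal.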
The promise is in the ability to move from system and time-specific correlation of attributes to the comparison of general relationships across systems. Clues provided by students of international and extra-national phenomena suggest that such general patterns of relationship may, in certain domains, be the result of events, developments, and conditions identifiable only in a global context.
EXTRA-NATIONAL AND INTERNATIONAL ANALYSES
Within my own discipline of political science there is probably no sub-field changing so rapidly as international relations. One consequence has been a certain erosion of identity within this area of inquiry. To the extent that fields of scholarship identify themselves by their methods and tools of analysis, international relations is now many fields. Quantitatively-oriented scholars who designate themselves as concerned with extra-national (e.g. international organizations) or international matters focus on such diverse things as coalition formation within the United Nations, the multi-national corporation, domestic and international violence, the attributes of collective events, and economic integration within and between nation states. There is an increasingly large body of students of global phenomena who eschew the nation state as their most useful unit of analysis.
One of the ICPR's biggest gambles was the launching, three and one-half years ago, of the International Relations Archive. At that time the ICPR sought to acquire a large number of data sets of diverse content from scholars in the field who represented the most innovative thrusts in international studies. The list is too long to reiterate in the body of this paper, but the types of data are of considerable interest. Many of these studies provide the means for obtaining insight into the role of Latin American nations in the world community. They enable one to examine the question "How regionally particular are the members of the Latin American community in the world at large?" Whether one is dealing with event data such as collected by Charles McClelland or the Feierabends or whether one is examining inter-nation conflict or coalition formation within international organizations, the opportunities for such research in a relatively inexpensive manner have been significantly expanded by the generosity of scholars in the community and the institutional developments within ICPR. In fact, the utilization of data from the International Relations Archive of the ICPR has expanded more rapidly than any other component of ICPR activities.
The initial gamble which led to the creation of the IR Archive was based on two assumptions: first, it was assumed that scholars in the field would indeed be willing to accept the ethic of sharing data; second, it was assumed that there was an as yet unidentified market for readily and inexpensively available quantitative data on international affairs. Both of these assumptions proved valid. As a consequence, the rate of utilization of the International Relations Archive has increased by better than 200% annually for the last three years.
The number of card images of data distributed is a weak reed upon which to base optimistic renditions of the glories of archives and the impact they have had. What do people do with these data once they acquire them? Why has not the full range of theoretical and design variation such as I have discussed in the preceding pages been visible in the published literature of the day? There are some interesting dynamics to the process of data archiving and utilization which help illuminate these questions. Some comprehension of usage, timing, and information diffusion will also be of benefit to students of Latin American politics who are anticipating investment of time, energy, and institutional resources into the archiving of machine-readable data.
USE AND PUBLICATION WITH ARCHIVAL RESOURCES
The ICPR currently has 170 member institutions. The modal user of ICPR data is still a political scientist. However, the financial support for and the utilization of services from ICPR are increasingly inter-disciplinary. Expansion of aggregate membership as well as diversification among the community of users helps to explain the average 80% annual increase in volume of data distributed by the three archives within ICPR over the past 5 years. Accurate records of what in fact is done with these data are very hard to come by. Periodically we canvass the representatives in order to get an estimate of the teaching applications, student papers, dissertations, and publications which are taking place at their institutions. The response rate is never anywhere near 100%. And even those responding often are unable to give us a full accounting of what usage is made of the resources on their own campuses. We have, therefore, resorted to searching journals in order at least to find published journal articles utilizing ICPR-supplied data.
Anyone working closely with data archives appreciates the problem of accounting for usage. All of the major archives request their users to acknowledge the archives' assistance in publications using data so acquired. Most request copies of published material. Of the articles discovered in our journal search this year, ICPR was cited in about 90% of the cases. However, the two copies of publications (clearly requested in all ICPR codebooks) had been received for only 7% of the publications. The problem for the archives is not so much being unloved as it is being able to demonstrate to those who pay the bills that the investment is sound.
Part of the accounting problem was revealed to us this year when we conducted a modest study of turn-around time between archival receipt of data and actual publication in major journals. Customarily, data received by ICPR are announced to the Official Representatives on each member campus within three months. However, most scholars who utilize the data usually discover a study in the annual publication of our Guide to Resources. One year is a conservative estimate for minimal diffusion of information about the availability of a data set. From the time a user receives data until publication is usually three years. Half of that period is spent in analysis and writing. The other half is expended in reviewing and publication.
Most of the several dozen articles published last year with data received through ICPR used data that left Ann Arbor in 1967-68. In 1967-68, ICPR distributed 12,000,000 card images of data. In 1971-72, over 40,000,000 card images were distributed. We will see most of the publications on those data during 1975-76. By 1971 not only the magnitude, but also the substantive content and potential for complex analysis was vastly expanded over that represented in current publications using the data sent in 1967-68.
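The two distribution figures just cited imply an average rate of growth that can be worked out directly. The sketch below is a back-of-the-envelope calculation only: it assumes smooth compound growth over the four years between 1967-68 and 1971-72, treats "over 40,000,000" as exactly 40,000,000, and uses a hypothetical helper named `implied_annual_rate`.

```python
def implied_annual_rate(start, end, years):
    """Constant annual growth rate that carries `start` to `end` in `years`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end, and years must all be positive")
    return (end / start) ** (1.0 / years) - 1.0


# Card images distributed, as reported in the text.
images_1967_68 = 12_000_000
images_1971_72 = 40_000_000  # "over 40,000,000", so this is a lower bound

rate = implied_annual_rate(images_1967_68, images_1971_72, years=4)
print(f"implied average annual growth: {rate:.1%}")
```

Because the 1971-72 figure is a floor, the result (roughly 35 percent a year) is a lower bound for that particular four-year span and is not directly comparable to growth figures quoted for other periods.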
Similar patterns are likely in the experience of other archives. We cannot be certain what analyses are in the pipeline at present, but we can be confident that the modes and magnitude of research exploiting these resources justify the investment of time and talent that led to the creation and sustains the growth of the various social science data archives.
Most of the observations and speculations in this paper apply to many avenues of social inquiry. The multi-dimensional matrix of social information to which I have alluded has many empty cells. But some of those of immediate interest to students of Latin American social and political life have significant entries. I would urge anyone launching research on Latin America to examine with care the current contents of the archives. The major archives are eager to apply their varied capacities to building resources for Latin American studies. As the realization of their current capacities, both operative and potential, becomes better known and as the commitment to sharing data resources becomes even more widely accepted among students of Latin American phenomena, the multiplicative effect seen in other domains of inquiry will become commonplace in Latin American studies as well.
DATA SETS WITH LATIN AMERICAN CONTENT IN THE ICPR
The data sets listed below are available from the Inter-University Consortium for Political Research. Modes of access, servicing policies, and other services of the ICPR are described in the Guide to Resources and Services of the ICPR (ICPR, P.O. Box 1248, Ann Arbor, Michigan 48106). Studies indicated by an asterisk (*) were supplied to the ICPR by the International Data Library and Reference Service (Berkeley) and may be obtained from there or from ICPR.
I. Individual (Micro) Data
A. Gabriel Almond and Sidney Verba, Five Nation Study. (ICPR 7201) A cross-national survey of five western nations: the United Kingdom, Germany, Italy, Mexico, and the United States. The United Kingdom has 963 respondents; Germany, 955; Italy, 995; Mexico, 1008, weighted to 1295; United States, 970. Each country has 4 cards of data per respondent, 166 variables. Interviewing took place during June and July, 1959, in all countries except the United States, where it took place in March, 1960. The interviews were largely structured, ranging in length from about forty minutes to somewhat over an hour. About 10 percent of the questions were open-ended.
The study concentrated on tapping respondents' basic political attitudes with questions that emphasized political partisanship, political socialization, and attitudes toward specific institutions, as well as the political system and culture as a whole. Specific variables included in the study were respondents' political awareness and feelings of political efficacy, feelings toward bureaucracy, police, political parties, campaigning, different levels of government, and such institutions as the school, family, and place of work. The number and types of organizations to which respondents belonged were recorded, as well as information bearing on the respondents' own upbringing and educational experiences. For a discussion of the data see Gabriel Almond and Sidney Verba, The Civic Culture: Political Attitudes and Democracy in Five Nations (Princeton: Princeton University Press, 1963).
B. Kurt W. Back, Reuben Hill, and J. Mayone Stycos, The Family and Population Control: A Puerto Rican Experiment in Social Change. 888 respondents, 2 cards of data per respondent. Data for the study were collected in 1953-1954 in both urban and rural areas of Puerto Rico. The sample includes four categories of respondents: those who had never used birth control devices, active users of these devices, those who had stopped using methods of birth control, and those who had been sterilized. Of the total sample, 566 interviews were conducted with wives only, 322 with husband and wife together. The data contain blanks.
The study explores the relationship between husband and wife in questions about family organization and role, degree of intimacy, sexual relations, and satisfaction with the present marriage. Further variables probe attitudes toward children: ideal family size, importance of children in marriage, and parent-child relations. The study also examines the respondent's attitude toward birth control, knowledge of where to obtain birth control advice and materials, use of birth control materials, and birth control methods the respondent uses. Derived measures include several Guttman scales. See Kurt Back, Reuben Hill, and J. Mayone Stycos, The Family and Population Control: A Puerto Rican Experiment in Social Change (Chapel Hill: University of North Carolina Press, 1959); J. Mayone Stycos, Family and Fertility in Puerto Rico (New York: Columbia University Press, 1955).*
C. Hadley Cantril, The Pattern of Human Concerns. (ICPR 7258) Of the fourteen nations reported in Professor Cantril's book, The Pattern of Human Concerns (New Brunswick, N.J.: Rutgers University Press, 1965), the Consortium has data for the following ten:
Brazil, 1142 respondents weighted to 2740, 4 cards of data per respondent, approximately 73 variables, collected late 1960 and early 1961.
Cuba, 992 respondents, weighted to 1490, 6 cards of data per respondent, approximately 40 variables, collected April-May, 1960. The sample represents only urban areas.
Dominican Republic, 814 respondents weighted to 2443, 4 cards of data per respondent, approximately 63 variables, collected April, 1962.
India I, 2306 respondents weighted to 5720, 5 cards of data per respondent, approximately 45 variables, collected in spring, 1962. The sample under-represents females.
India II, 2014 respondents weighted to 4994, 6 cards of data per respondent, approximately 45 variables, collected in January, 1963 (after border fighting with China). The sample under-represents females.
Israel I, 1170 respondents, 5 cards of data per respondent, approximately 39 variables, collected November, 1961 through June, 1962.
Israel II, Kibbutzim, 300 respondents, 5 cards of data per respondent, approximately 39 variables, collected July through October, 1962.
Nigeria, 1200 respondents weighted to 2876, 1 card of data per respondent, approximately 66 variables, collected September through November, 1962. The second sample has 2841 respondents, 1 card of data per respondent, and was collected in spring, 1963.
Panama, 642 respondents weighted to 1351, 4 cards of data per respondent, approximately 82 variables, collected early in 1962.
United States, 1549 respondents weighted to 2696, 4 cards of data per respondent, approximately 101 variables, collected August, 1959.
West Germany, 480 respondents, 1 card of data per respondent, collected September, 1957.
Yugoslavia, 1524 respondents, 1 card of data per respondent, collected in Spring, 1962.
Besides ascertaining the usual personal and demographic information, Cantril tried through his "Self-Anchoring Striving Scale," an open-ended scale asking the respondent to define his hopes and fears for himself and his nation, to discover the two extremes of a self-defined spectrum on each of several variables. After getting these subjective ratings from respondents, Cantril had each respondent indicate his perception of where he and his nation stood on a hypothetical ladder at three different points in time. For information on samples, coding and the means of measurement see The Pattern of Human Concerns. All data were collected by native interviewers.
D. Centro de Analisis Social Latinoamericano, Caracas, Venezuela, Attitudes of Students at La Salle. 170 respondents, 1 card of data per respondent. The study was conducted in 1964 in Caracas, Venezuela at La Salle, a Catholic boys' school. The data contain non-numeric codes.
Respondent's pride in his school, the type of education he receives, and a definition of the La Salle spirit are ascertained. The study probes the respondent's attitude toward sexual morality, sexual relations before marriage, and responsibility in cases of adultery. Religious knowledge is also explored in questions about the holy mass and the gospels, as is the respondent's familiarity with such diverse concepts as communism, liberalism, and Christianity.*
E. Corporacion de la Reforma Agraria, Santiago, Chile, Agrarian Reform in Chile. 1422 respondents. The study, conducted in Chile in 1963, contains three samples. The first sample was drawn from urban zones of the three main cities in Chile (Santiago, Concepcion, and Valparaiso/Viña del Mar), with 998 respondents and 1 card of data per respondent. The second sample of 324 respondents, 2 cards of data per respondent, was drawn from agricultural workers in the north, central, and southern agricultural zones of Chile. The third sample was drawn from agricultural zones throughout Chile, and the respondents sampled were recipients of land through agrarian reform. The sample contains 100 respondents with 2 cards of data per respondent. The data in all three samples contain blanks.
Sample one questions urban residents about their knowledge and attitude toward Chilean agrarian reform, the importance of mining, agriculture, and industry development, positive and negative effects of agrarian reform, and, in addition, probes knowledge of the literacy campaign in Chile and its participants. Demographic data are also explored. The respondents of the second sample are asked the purposes of agrarian reform, qualifications necessary to receive land, attitudes toward Chilean agrarian reform, and the positive and negative effects of agrarian reform. Agrarian worker attitudes toward agricultural cooperatives are also examined. Recipients of land through agrarian reform, sample three respondents, are queried about the advantages they see in receiving land through agrarian reform. The study also ascertains their knowledge of the purposes of reform, their attitudes toward agrarian reform, and its positive and negative effects. The respondent's opinions about formal schooling for children in the area as well as teaching home crafts to women and instructing local men in working the land are also explored.*
F. A. Fiensot, University Students' Images of the United States. 879 respondents, 7 cards of data per respondent. The survey was administered to students at nine universities in Brazil in 1963. The data contain non-numeric codes.
The study establishes the respondent's university, major field of study, residence, self-perceived social class, and race. Further variables probe the respondent's knowledge of the world outside of Brazil, especially the United States. The major portion of the study measures the respondent's concept of similarities and dissimilarities between the United States and Brazil in such varied areas as attitudes toward the family as a primary group, social class structure, importance of the labor movement, the quality of education, meaning of nationalism, opportunities for the Negro, moral standards of the people, political participation, and private vs. public initiative within the economy. Respondents are also asked to estimate the answers which they feel a U.S. citizen would give to these same questions about Brazil. The study probes sources of the respondent's information about the United States including newspapers, magazines, U.S. movies, radio, television, and personal relations with people from the United States.*
G. Gino Germani, Stratification and Mobility in Four Latin American Cities. The study was conducted in Buenos Aires, Argentina in 1960. Sample A, whose respondents are household members, contains 5764 respondents with one card of data per respondent. Sample B respondents are heads of household; there are 2077 respondents with four cards of data per respondent. The data contain non-numeric codes.
Sample A (household members) ascertains demographic information, details of the respondent's employment, and foreign backgrounds of those who have immigrated to Argentina. The respondent's native language, his familiarity with it, and his feelings of affection toward his native country are explored.
Sample B (family heads) questions respondents about country of birth and arrival in Argentina. The study further explores the respondent's leisure activities, his outlook on life and attitudes toward people, and his familiarity with his native language and country. A major portion of this sample traces the respondent's occupational patterns, beginning at age 21 and continuing through his present occupation. Father's and grandfather's occupations are also examined. Derived measures give the respondent's own occupational mobility as well as occupational change from one generation of his family to the next.*
H. Eduardo Hamuy, Political Behavior. 807 respondents, 6 cards of data per respondent. The study was conducted in Santiago, Chile in 1958. The first wave, interviewed prior to the presidential election, contains all 807 respondents, while the second wave was interviewed after the election and is composed only of the 339 respondents who voted in the election. The data contain non-numeric codes.
In the pre-election wave, the study examines political questions such as political party affiliation and policies associated with each party, presidential candidates and their platforms, the respondent's intention of voting and for whom, influential factors in the respondent's vote, and his past voting patterns. The respondent is also asked to pinpoint Chile's most pressing problems and to comment on the position of the Chilean peasants and workers. The post-election wave of the study probes the respondent's estimation of an administration under either of the two most popular candidates, his explanation of the election results, and his vote and the reasons for it. Extent of news media influence is also explored. Derived measures include socio-economic and occupation indices.*
I. Eduardo Hamuy, Stratification and Mobility in Four Latin American Cities. 822 respondents, 3 cards of data per respondent. This survey, conducted in 1961, was administered to 822 residents of Santiago, Chile. The data contain non-numeric codes.
The study thoroughly investigates past and present occupations of the respondent in order to ascertain socio-economic status within the society and to discover patterns of social and economic mobility. Variables include the respondent's satisfaction with his job and his feeling of permanence in it, the kind of work done, whether the respondent is self-employed or employed by a public or private institution, and the status which his occupation holds, from proprietor to unskilled laborer. These variables establish present occupation, and they are repeated for the respondent's occupation at age 21, 28, 35, and 45. The past is further explored through variables concerning the respondent's father and paternal grandfather and their occupations. After examining the respondent's role within the society, the study explores his awareness and understanding of the world around him, at the local, national, and international levels. These variables range from participation in local clubs, to opinions about the current government in Chile, to questions about Fidel Castro and the Cuban Revolution.*
J. Eugene Havens and Aaron Lipman, Personality Disorganization among Refugees of the Violencia. 135 respondents, 1 card of data per respondent. The study was conducted in Bogota, Colombia, in 1962. Data contain amps and dashes.
The respondent's background is explored in variables asking birthplace, number of years in Bogota, and occupation in his own land. The major portion of the study explores the respondent's attitudes towards political violence: its effects on people's confidence in the government, the position of the church, groups who have benefited from it, reasons for it, and the extent to which the violence has spread into areas of life other than politics. Derived measures include national identification and security scales and a scale showing attitudes toward the church.*
K. Institute for International Social Research, Attitudes of Cubans. 1490 respondents, 3 cards of data per respondent. The study was conducted in 1960 in Cuba. Data contain non-numeric codes.
The study contains primarily open-ended, multiple response variables. The present is explored in terms of the best, as well as the worst, aspects of the respondent's current situation; his degree of satisfaction with his present life; and, on a larger scale, positive and negative aspects of life in present day Cuba. The respondent is also asked to look ahead and describe life for himself and his country, at their best and worst imaginable ten years hence. Demographic data include the respondent's age, sex, race, marital status, education, and socio-economic group.*
L. Instituto de Sociologia, Universidad Nacional del Litoral, Rosario, Argentina, Political Apathy in Rosario. 560 respondents, 1 card of data per respondent. The study, conducted in 1963 in Rosario, Argentina, was administered to two samples, one of 282 respondents and the other of 278. The data contain amps and dashes.
Demographic variables establish sex, age, education, occupation, and income level of the respondents. Past and present levels of political interest are ascertained through questions which establish voting patterns and explore sources of the principles on which political parties stand. Further variables investigate interactions of the government with the people, such as the right of the state to intervene in economic activity and the equal representation of all social classes in the government.*
M. Joseph Kahl, Career Values in Brazil. 627 respondents, 3 cards of data per respondent. The study was conducted in 1960 in the Brazilian cities of Rio de Janeiro, Minas Gerais, and Rio de Sul. The data contain blanks.
The study thoroughly describes the respondent's current occupation, length of employment, what he likes most and least about his job, and his income. Variables further explore past occupations, the highest level of education attained, and the extent to which lack of education has handicapped the respondent's career. A major portion of the study probes the respondent's feelings about the nature of jobs and people: the importance of ambition and determination in one's job, individual vs. group interests, how best to "get ahead," importance of family ties, tendency to trust others, and corruption in the urban centers. A number of recodes and derived measures are included in the study.*
N. Joseph Kahl, Career Values in Mexico. 740 respondents, 4 cards of data per respondent. The study was conducted in 1963 in both urban and rural areas of Mexico. The data contain blank codes.*
O. Henry A. Landsberger, Personnel Managers. 91 respondents, 4 cards of data per respondent. The study was conducted in 1963 in three cities in Chile: Santiago, Concepcion, and Valparaiso. The data contain non-numeric codes.
The study explores in detail the respondent's position as a personnel manager. Variables ascertain the industry or service in which he works, the number of people he manages, a full definition of his present position and the responsibilities which it entails, and his past experience in personnel work. The respondent's position within his company is examined, as is the interaction which he has with various other positions and departments within the company. A major part of the study investigates company relations with the workers' union. Variables probe personnel's interaction with the unions, degree of union participation in certain aspects of company management, help which the union gives its members, and influence which the union has on its members. In addition to specific questions about workers in the respondent's company, the study asks the respondent to rate the average Chilean worker on a variety of scales.
P. Latin American Center for Research in the Social Sciences, Rio de Janeiro, Brazil, Stratification and Mobility in Four Latin American Cities. 3628 respondents, 1 card of data per respondent. Data were collected from 1959-1962 in Guanabara, Brazil and currently contain non-numeric codes. The sample consists of two distinct groups: respondents from the urban area generally and respondents from the slums.
The respondent's occupation is examined in variables which describe current job, amount of supervised rather than independent work, permanent or transitory nature of job, income, and second occupation if applicable. Further variables ascertain the respondent's interest and involvement in his surroundings: membership in clubs and organizations, political party affiliation, and newspapers respondent reads. A major portion of the study explores the composition of the respondent's family and the kind of home in which he lives.*
Q. S.M. Lipset, University Students-Values, Vocations and Political Orientations.
The following are several studies conducted in 1964 or 1966. The general study concentrates on a description by university students of their problems and ideas: educational, economic, social and political.
The respondent's educational background is explored through questions about types of schools attended, subjects excelled in, evaluation of self as a student, and current field of specialization. The value which the respondent places on his education and on the university in general is examined in variables which probe the importance of completion of his studies, chosen occupation after leaving the university, and his evaluation of life as a student. Another major portion of the study reveals the national and international arenas as the respondent sees them from the university perspective. Respondents are asked their opinions about specific issues and the role of the government in the economic life of the country. Further variables concentrate on areas of international scope. Finally, the studies examine the respondent's outlook on life through questions which tap his views of morality, his tendency toward progressive political thinking, and his views of the world he will enter as he leaves the university.*
Brazil, 1322 respondents, 2 cards of data per respondent. Conducted in 1964, the study questions a sample of university engineering students in Brazil to give a picture of social, economic, political, and psychological aspects of university life. The data contain non-numeric codes.
Colombia, 1594 respondents, 3 cards of data per respondent. The study was conducted in 1964.
Mexico, 830 respondents, 3 cards of data per respondent. The study was conducted in 1964. The data contain non-numeric codes.
Panama, 1034 respondents, 3 cards of data per respondent. The study was conducted in 1964. The data contain non-numeric codes.
Paraguay, 482 respondents, 3 cards of data per respondent. The study was conducted in 1966. The data contain blanks.
Puerto Rico, 577 respondents, 3 cards of data per respondent. The study was conducted in 1964. The data contain non-numeric codes.
Uruguay, 475 respondents, 3 cards of data per respondent. The study was conducted in 1966. The data contain blanks.
R. Marplan, Santiago, Chile, Students at State University of Santiago, University of Concepcion, and University of Temuco. 1542 respondents, 5 cards of data per respondent. The study was conducted in 1964 in the Chilean cities of Santiago, Concepcion, and Temuco. The data contain non-numeric codes.
The study examines the respondent's attitudes toward such national issues as important problems which Chile must face, major obstacles to more rapid development, collaboration with the United States to promote the economic development of Chile, the position of U.S. companies in Chile, and the influence of political, military, religious, and professional groups. The respondent's opinions of communism, capitalism, and socialism are probed in variables asking which system could be best for Chile and why. International affairs are also examined. The respondent is asked about the Cuban Revolution, the effectiveness of the Organization of American States, and his opinion of the Alliance for Progress, especially as it affects Chile. Exposure to the mass media, including foreign radio broadcasts, is explored, as is the respondent's opinion of various foreign governments' publications.*
S. Jose Miguens, Voting Attitudes. 1383 respondents, 1 card of data per respondent. Data for the study, conducted in 1963, were gathered from four electoral districts of Argentina: Buenos Aires, Partidos Suburbanos, Mar del Plata, and Bahia Blanca. Sampling criteria were the voting population. The data contain blanks.*
The study examines the importance of the forthcoming general elections, the effect possible candidates would have on the country in general, economic corruption, and economic progress and changes. Further variables explore the possibility of the respondent voting for radical candidates and his attitude toward voting policies. Demographic data include education and occupation of the respondent, his employment status, sex, and age group.*
T. Jose Miguens, Voting Opinions. 364 respondents, 1 card of data per respondent. The study was conducted in 1965 in the zones of Velez Sarsfield and Avellaneda, Argentina. The data contain blanks.
The study ascertains demographic data of age, education, occupation, and socio-economic status of the respondent. Political party affiliation and interest in voting in the coming congressional elections are also explored. Further variables trace the respondent's voting history during the past five years and the people with whom the respondent discusses his vote.*
U. Eduardo Munoz, Attitudes and Opinions Towards Education and Work. 1010 respondents, 3-4 cards of data per respondent. Conducted in Chile in 1964, the study is composed of four samples. Sample size ranges from 118 to 333 respondents. Sample A contains youngsters 12-14 years of age, Sample B is made up of respondents 16-22 years of age, Sample C respondents are adults, and Sample D is composed of primary and secondary teachers. The data contain non-numeric codes.
The study probes Chilean attitudes towards jobs and education by sampling children and young adults, whose futures are not yet decided, and working adults, teachers in particular, whose occupations and educations are settled. Variables determine the respondent's highest educational and occupational aspirations in comparison with realistic appraisals of what he is now doing and will be doing in the future. The study further explores important factors in deciding upon an occupation, as well as advantages which an educated person enjoys. Respondents in all groups but Sample A are asked to judge certain goals of the Chilean education system, their desirability, and the degree to which they are fulfilled. These respondents are also asked to rate, on a high to low scale, the social prestige of given occupations in Chile. A major portion of the study examines the appeal which various aspects of jobs have for respondents: personal contact with supervisors, fellow-workers, and clientele, fixed vs. relaxed time schedules, supervision, initiative required, responsibility assumed, intensity of work, physical effort necessary, and variety of duties performed.*
V. David Nasatir, Role of the University in Development of Political Consensus. 1660 respondents, 4 cards of data per respondent. Data for the study were collected in 1963 in five cities in Argentina: Buenos Aires, Mendoza, Cordoba, Rosario, and Resistencia. The sample contains two distinct groups: students at Argentine universities and non-students chosen by quota sampling based on dimensions of age, sex, and social class. The data contain non-numeric codes.
Principal variables examine influences in the respondent's choice of a major field of study, study habits, participation in university government, social contacts within the university, and the prestige of a university education. In addition, the study explores political attitudes and behavior, perception of stratification, occupational status, and attitudes toward group interests. Demographic data include the respondent's age, sex, marital status, religion, and residence during childhood, as well as father's education, occupation and income. See "Education and Social Change, The Argentine Case", Sociology of Education, Vol. 39 (Spring, 1966), No. 2; "University Experience and Political Unrest of Students in Buenos Aires", Comparative Education Review, Vol. 10 (June, 1966), No. 2.*
W. Ronald Scheman, Law School Students. 1251 respondents, 1 card of information per respondent. Data for the study were collected in 1960 at fifteen law schools in several states in Brazil. Sampling criteria were law school students, primarily in the first, third or fifth year of study; the law schools were selected to get a geographic and economic cross-section of the country. All students present in class on a specified date were given the questionnaire. The data contain non-numeric codes.
The study ascertains such family background data as age and education of the respondent's siblings, national origins of parents and grandparents, education of parents, occupation of father, and social class of the respondent's family. The respondent's past is further explored through questions about his motivation for choosing law as a field, subjects other than law which he has studied, and average grades obtained. In addition, the study probes occupational intentions, frequency of travel abroad, voting participation, desirability of student political activity, and the respondent's involvement in student politics. See articles by Ronald Scheman in the following publications: "The Social and Economic Origin of the Brazilian Judges", Inter-American Law Review, Vol. IV (January, 1962), No. 1; "Brazil's Career Judiciary", Journal of the American Judiciary Society, Vol. XLVI (December, 1962) No. 7; "The Brazilian Law Student-Background, Habits, Attitudes", Journal of Inter-American Studies, Vol. 5 (July, 1963).*
X. S. Schwartzman and Mora Y. Araujo, Prestige of Latin Nations According to Students. 362 respondents, 3 cards of data per respondent. Interviewing of students for the study, done in 1965, took place at several universities, only one of which is not in Latin America. The data contain blanks.
Demographic variables establish nationality, age, sex, the field and years of university study. Respondents were then asked a series of questions to be answered for each of the twenty Latin American countries covered by the study. Variables which asked the students to name the capital and head of state for each country as well as to approximate the population of each reveal extent of basic information about these Latin American nations, while estimates of per capita income, illiteracy, industrialization, and race in each country probe for more in-depth knowledge. Respondents were also asked to rate each country's prestige and importance within the Latin American system and to consider what criteria are relevant for defining the position of a country: its size, average education, industrialization, political stability, degree of urbanization, and scientific development.*
Y. Glaucio Soares, Voting Attitudes in Rio. 1884 respondents, 1 card of data per respondent. The study was conducted in August of 1960, in Rio de Janeiro, Brazil. The sample, taken from the voting population of the city and stratified by electoral zones, was contacted two to four weeks before the presidential election. The data contain blanks.
The study first establishes the amount of political information which the respondent receives through the news media. Further variables ascertain his interest in the coming election, his past voting decisions, and his party preference. The respondent's perception of social class rating and his ideas about the distribution of wealth and improvement of living conditions are also explored. Demographic data include the respondent's occupation, age, marital status, race, sex, and socio-economic status.*
Z. United States Information Agency, World Survey II, Attitudes Toward Domestic and Foreign Affairs. 466 respondents, 2 cards of data per respondent. Data for the study were collected in February and March of 1964 in Rio de Janeiro, Brazil. The sample consists of persons 18 years of age and older. The data contain blanks.
Demographic data ascertained by the study include the respondent's occupation, marital status, sex, age, and education. In its investigation of Brazil's domestic issues, the study explores respondent attitudes in such areas as standard of living, population problems and birth control, attitudes toward political parties and their leaders, Brazil's stand in the conflict between the communist and anti-communist ideologies, and the economic influence of the United States and the Soviet Union on Brazil. Variables concerned with issues and affairs at the international level examine the respondent's comparisons of the achievements and foreign policy of the United States and the Soviet Union, as well as his opinions about the nuclear test ban and disarmament, attitudes toward Fidel Castro and his impact on life in Cuba, the position of the United Nations, and the treatment of Negroes in France, U.S., Russia, and South Africa.*
Nation Level Aggregate Data (Including Latin American Nations)
A. Arthur S. Banks and Robert B. Textor, Cross-Polity Survey. Data for 115 polities. Each polity has nominal and ordinal data on 59 "raw characteristics" and 194 "finished characteristics." The raw characteristics include economic-demographic indicators and more subjective measures such as degree of political modernization and interest articulation. The finished characteristics are dichotomous variables which contrast groups of polities in various ways. Each polity is classified on one side or another of each dichotomy. The data were originally published in Arthur S. Banks and Robert B. Textor, A Cross-Polity Survey, Cambridge, Massachusetts: M.I.T. Press, 1963. Also see Phillip Gregg and Arthur Banks, "Dimensions of Political Systems: A Factor Analysis of A Cross-Polity Survey," and Arthur Banks and Phillip Gregg, "Grouping Political Systems: Q Factor Analysis of A Cross-Polity Survey," both in John Gillespie and Betty Nesvold (eds.), Macro-Quantitative Analysis, Beverly Hills, California: Sage Publications, 1971.
B. Arthur S. Banks, Cross-Polity Time Series: 1815-1966. Time series data for 153 independent nations. There are 102 variables aggregated by year. The data are primarily interval level. There are 33 variables coded for the period 1815-1966; the remaining variables are coded for more limited time periods. Demographic, socio-economic, and political attribute data are included. Data are published in Arthur Banks, Cross-Polity Time Series, Cambridge, Massachusetts: M.I.T. Press, 1971.
C. Arthur S. Banks, Domestic Conflict Behavior: 1919-1966.
Domestic conflict data for 111 countries. Data were collected for the years 1919-1939 and 1946-1966 on eight domestic conflict variables: riots, demonstrations, purges, government crises, strikes, coups, revolutions, and guerrilla war. Data exist for 42 years on 52 countries, and there are data for less than 42 years on 59 countries. Data may be obtained in either of two formats: nations as cases or nation/years as cases. In the first format a case would be Canada and variables would be riots-1919, riots-1920, riots-1921, etc. In the second format, Canada-1919 is a case, riots a variable and Canada-1920 a second case.
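The relation between the two formats can be illustrated with a minimal Python sketch. The function name, the dictionary layout, and the counts shown are illustrative assumptions and are not taken from the archive's files or codebooks.

```python
def wide_to_long(wide_rows):
    """Convert nations-as-cases records into nation/year-as-cases records.

    wide_rows maps a nation name to a record whose keys look like
    'riots-1919' (variable name, hyphen, year). This layout is assumed
    for illustration only.
    """
    long_rows = {}
    for nation, record in wide_rows.items():
        for name, value in record.items():
            # Split 'riots-1919' into the variable 'riots' and the year 1919.
            variable, year = name.rsplit("-", 1)
            case = long_rows.setdefault((nation, int(year)), {})
            case[variable] = value
    return long_rows


# Invented counts, for illustration only.
wide = {
    "Canada": {
        "riots-1919": 2,
        "riots-1920": 0,
        "strikes-1919": 1,
        "strikes-1920": 3,
    }
}
long_cases = wide_to_long(wide)
```

In the result, the key ("Canada", 1919) identifies one case and ("Canada", 1920) a second case, each carrying riots and strikes as variables.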
D. Richard Cady, Franz Mogdis, and Karen Tidwell, Major Power Interactions with Less Developed Countries: 1959-1965. Data for 89 less developed countries on 56 variables recorded for 1959, 1961, 1963, and 1965. Variables include such measures as imports and exports, diplomatic representation, visits, communication rates, and proportions of imports from and exports to the major powers. These data were supplied by the Social Science Department, Bendix Corporation. The data set contains selected interactions of the United States, the Soviet Union, the People's Republic of China, and Eastern European Countries with the less developed countries.
E. Ivo K. Feierabend, Rosalind L. Feierabend, and Betty Nesvold, Yearly Measurement of Permissiveness-Coerciveness of Regime. This data set contains detailed information on political structures relevant to the general concept of permissiveness-coerciveness of regime. The information gathered covers the following aspects of political regimes: associational group strength and freedom as indicated by the freedom accorded the trade union movement, church organizations, etc.; type of executive; nature and strength of party opposition; type of elections; party strength in elections; strength of parties in legislature; extent of press censorship, both internal and against representatives of foreign press; degree of independence of judiciary; structure of local government; type of suffrage; extent of civil rights; extent of social reforms (land reform, nationalization of industry, etc.); changes of regime (dates, duration, regime rating); and extent and occurrence of oppressive measures (dismissal, imprisonment, execution, relocation, etc.).
The data set includes 84 countries for 22 years (1945-1966), and contains some 17,000 data cards. Data are drawn from encyclopedic sources for specific areas of inquiry, such as press censorship and trade union freedom.
F. John Gillespie and Dina Zinnes, Military Defense Expenditure Data: 1948-1970. Data for 123 nations for military defense expenditures. The data are either in U.S. dollars or national currency with an exchange rate provided. Sources are U.N. Statistical Yearbook, U.S. ACDA publications, and the U.N. Statistical Bulletin for Latin America.
G. Ted Gurr, A Causal Model of Civil Strife: 1961-1965. Data on 114 polities for the period 1961-1965. Variables are of three general types: magnitudes of conspiracy, internal war, turmoil, and total strife; measures of deprivation; and measures of mediating variables. Measures of deprivation include economic, political, and long- and short-term deprivation. Mediating variables include legitimacy, coercive potential, institutionalization, past strife levels and facilitation of strife. For a more complete discussion, see Ted Gurr, "A Causal Model of Civil Strife: A Comparative Analysis Using New Indices," American Political Science Review, LXII, 4, (December, 1968), and in Macro-Quantitative Analysis, John Gillespie and Betty Nesvold (eds.), Beverly Hills, California: Sage Publications, 1971.
H. Ted Gurr, Genesis of Civil Violence Project: 1961-1963. Data for 114 nations on 60 variables for years 1961, 1962, and 1963. This study consists of aggregate data on indicators of civil violence and its predictors. Nations have also been categorized into four clusters, originally based on factor analysis, including political, socio-cultural, technological development, and size of population and production center clusters. See Ted Gurr, "Conditions of Civil Violence: First Tests of a Causal Model," in John Gillespie and Betty Nesvold (eds.), Macro-Quantitative Analysis, Beverly Hills, California: Sage Publications, 1971.
I. Michael Haas, International Subsystems: Subsystem Member Characteristics. Data on the members of each of twenty-one international subsystems, 1649-1963. The unit is the nation in each subsystem, in all, 457 cases (about 150 different national entities). There are some twenty-eight variables for each subsystem member. See Michael Haas, "International Subsystems: Stability and Polarity," The American Political Science Review, LXIV, 1 (March, 1970), 98-123.
J. Rudolph J. Rummel, Dimensionality of Nations. Data are for 82 nations on 332 variables generally for 1955. Social, demographic, cultural, economic, geographical and political national attributes, and international involvement indices are included.
K. Rudolph J. Rummel and Raymond Tanter, Dimensions of Conflict Behavior Within and Between Nations, 1955-1960. Data for 86 countries on 22 variables. This data collection is the product of two separate studies conducted by Rudolph J. Rummel and Raymond Tanter utilizing identical variables for the time periods 1955-1957 and 1958-1960, respectively. The variables are domestic conflict behavior, such as riots and coups, and foreign conflict behavior, such as protests and threats. Data originally used in Rudolph J. Rummel, "Dimensions of Conflict Behavior Within and Between Nations," and Raymond Tanter, "Dimensions of Conflict Behavior Within and Between Nations, 1958-1960," both in John Gillespie and Betty Nesvold (eds.), Macro-Quantitative Analysis, Beverly Hills, California: Sage Publications, 1971.
L. J. David Singer and Melvin Small, Diplomatic Exchange Data 1815-1970. The diplomatic exchange data are of two types: the asymmetric file and the symmetric file. In the asymmetric file, for each international system member, a code is given to indicate from which other international system members the first received diplomatic missions. In the symmetric file the assumption is made that every nation receiving a mission from another nation also sends one to that nation.
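The symmetry assumption behind the second file can be sketched in Python as follows. The function, the dictionary-of-sets layout, and the member labels A, B, and C are hypothetical; the actual file layouts and codes are documented in the archive's codebook.

```python
def symmetrize(received):
    """Derive symmetric exchange links from an asymmetric 'received' table.

    received maps each system member to the set of members from which it
    received a mission. Under the symmetry assumption, a member that
    receives a mission from a sender also sends one back, so the sender
    is recorded as receiving a mission from that member as well.
    """
    symmetric = {member: set(senders) for member, senders in received.items()}
    for receiver, senders in received.items():
        for sender in senders:
            symmetric.setdefault(sender, set()).add(receiver)
    return symmetric


# Hypothetical members: A receives from B, C receives from A.
asymmetric = {"A": {"B"}, "B": set(), "C": {"A"}}
symmetric = symmetrize(asymmetric)
```

Here B and A each gain a reciprocal link, so every recorded exchange appears in both directions.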
M. J. David Singer and Melvin Small, Diplomatic Missions Received by Each International System Member: 1817-1970. For each international system member, the number of missions received, the particular nations sending missions and the rank of mission from each nation are coded. Data are recorded at approximately five year intervals beginning in 1817.
N. J. David Singer and Melvin Small, The Wages of War: Nation Data. There are two nation level data sets available. The first is Total National War Experience. This set contains data on 35 variables related to the cumulative war experience during the years 1815-1965. The second data set is Nation in Each War. This set contains data on some 29 variables related to the experience of a nation in a particular war. There are 239 nation/war cases. See J. David Singer and Melvin Small, The Wages of War, 1816-1965: A Statistical Handbook, John Wiley and Sons, 1972.
O. Charles L. Taylor and Michael C. Hudson, World Handbook of Political and Social Indicators, II: Nation Data. There are three nation-level data sets available.
A. National Aggregate Data. This section consists of data for 136 polities on some 300 variables. Included are indicators of population size and growth, communications, education, culture, economic, and political variables for the four base years: 1950, 1955, 1960, and 1965. Data for 1965 are about 90% complete but the proportion of missing data is much higher for the three earlier years. Extensive documentation is provided by the investigators on sources and data quality. This documentation is printed in the codebook but may be obtained in computer readable form either merged into the substantive data file or as a separate file. There are about 365 note variables.
B. Annual Events. This section consists of data on 18 political events aggregated by year to the nation level for the years 1948-1967. The events included are: riots, deaths from political violence, political assassinations, armed attacks, elections, protest demonstrations, regime support demonstrations, political strikes, renewal of power, unsuccessful executive transfers, executive adjustments, regular executive transfers, executions, acts of negative sanctions, acts of relaxation of political restrictions, and external interventions. Sources are The New York Times and AP.
C. Raw Data. This section contains two sets of raw data: one has data used in Section I for constructing measures of fractionalization and concentration; and the other has data used for constructing measures of inequality. The fractionalization and concentration data are recorded for each city, political party, etc., for these variables: city populations, ethnic groups, language groups, export commodities, export receiving countries, distribution of votes by political party and distribution of seats in the lower legislative house. There are over 7,000 records in this set. The inequality data are recorded as distributions of farms, acreage, labor forces and gross domestic product.
Data were collected by the World Data Analysis Program at Yale University. See Charles L. Taylor and Michael C. Hudson, World Handbook of Political and Social Indicators, Second Edition, New Haven: Yale University Press (forthcoming, 1972).
P. United States Agency for International Development, Economic and Social Indicators for Latin America, 1960-1971. These data are reported in Summary Economic and Social Indicators, 18 Latin American Countries: 1960-1971, prepared by the Office of Development Programs, Bureau for Latin America, Agency for International Development, and were compiled to meet the current data needs of that office.
The countries covered in the data are Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Uruguay, and Venezuela. There are 78 substantive variables for each of 12 years (the % change variables have data for only 1961-1971). This yields 918 actual variables in the data set. Variables include indicators for items such as GNP, gross investment, population, Government finances, agriculture, education, and health.
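The total of 918 is lower than the full grid of 78 variables by 12 years. A short Python check shows one way to reconcile the figures: if each percentage-change variable lacks only its 1960 value, the shortfall equals the number of such variables. The resulting count of 18 is inferred from the arithmetic and is an assumption, since the entry does not state it.

```python
substantive_variables = 78
years = 12            # 1960 through 1971
reported_total = 918  # figure given in the catalog entry

# Every variable observed in every year.
full_grid = substantive_variables * years

# Assumption: each percentage-change variable is missing exactly one
# year (1960), so the shortfall counts those variables.
implied_pct_change_variables = full_grid - reported_total

# Cross-check: level variables over 12 years plus change variables over 11.
level_variables = substantive_variables - implied_pct_change_variables
recomputed_total = level_variables * years + implied_pct_change_variables * (years - 1)
```

Under that assumption the recomputed total matches the reported 918.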
Q. United States Arms Control and Disarmament Agency, World Military Expenditures-1970. Data for 120 nations. The data are military expenditures and related data such as GNP, public education expenditures, public health expenditures and population. Data for military expenditures, armed forces and gross national product are reported at yearly intervals for the period 1964-1968 with a summary percentage of change figure for this period. Other data are available for 1968 only. Data are reported in U.S. Arms Control and Disarmament Agency's publication, World Military Expenditures, 1970, U.S. Government Printing Office.
R. Merged data from World Handbook of Political and Social Indicators and A Cross-Polity Survey. Data for 141 polities, 7 cards of data per polity. The data from these two collections were merged on country codes. As the polities included in A Cross-Polity Survey are a subset of those included in the Handbook, missing data codes have been assigned to countries with no Cross-Polity information.
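A merge of this kind can be sketched in Python as below. The country codes, variable names, values, and the missing data code of -9 are all hypothetical placeholders; the codes actually used are given in the archive's codebook.

```python
MISSING = -9  # hypothetical missing data code, not the archive's actual value


def merge_on_country(handbook, cross_polity, cross_polity_vars):
    """Merge two collections keyed by country code.

    Every Handbook polity is kept. Polities absent from the Cross-Polity
    collection receive the missing data code on each Cross-Polity variable.
    """
    merged = {}
    for code, handbook_row in handbook.items():
        row = dict(handbook_row)
        cp_row = cross_polity.get(code)
        for var in cross_polity_vars:
            row[var] = cp_row[var] if cp_row is not None else MISSING
        merged[code] = row
    return merged


# Invented codes and values, for illustration only.
handbook = {101: {"gnp_pc": 450}, 102: {"gnp_pc": 120}}
cross_polity = {101: {"press_freedom": 2}}
merged = merge_on_country(handbook, cross_polity, ["press_freedom"])
```

Polity 102 has no Cross-Polity record in this example, so its press_freedom value is set to the missing data code.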
S. Bruce M. Russett, Karl Deutsch, Hayward Alker and Harold Lasswell, World Handbook of Political and Social Indicators. Data for 141 polities on 70 variables. Data are interval level social, political, economic, demographic indicators and are generally for 1961-1963. Data were originally published in Bruce M. Russett, et al., World Handbook of Political and Social Indicators, New Haven: Yale University Press, 1964.
Sub-National Aggregate Data
A. Comparative Socio-economic, Public Policy and Political Data, 1900-1960. (Approximately 15,000 card-images.) Selected variables pertinent to the period from 1900 to 1960 for each of the four nations listed below. Each data set presents comparable data at the province or district level for each decade in the period. Variables included in most data sets deal with basic economic, social, political and public policy characteristics of the units of analysis. Various derived measures, such as percentages, ratios and indices, constitute the bulk of these data sets.
The data were collected and prepared by the staff of the Comparative Political Behavior Project of the Cornell University Center for International Studies, under the direction of Richard I. Hofferbert.
Canada: Data for all provinces, 1900-1960, including information on elections (since 1920), occupations, migration and expenditures in several areas of governmental services.
France: Data for all departments for all legislative elections since 1936, the two presidential elections of 1965 and 1969, and several referenda held in the period since 1958. Socio-economic data are provided for the years 1946, 1954 and 1962 while various policy data are presented for the period from 1959 to 1962.
Mexico: Data for all states at decennial points from 1910 to 1960. Socio-economic data are available for the entire period, while political and policy data are presented for the decades beginning with 1930.
Switzerland: Data for all cantons for each decennial year from 1900 to 1960. Data on revenue and taxation are presented along with socio-economic variables and political information such as referenda returns and party votes cast in National Council elections.
B. Torcuato S. Di Tella, The Social Structure of Argentina: Census Data on Economic Development. 538 cases (515 counties and 23 provinces in Argentina), 2 cards of data per case. Data were collected in 1965 in Argentina; they contain amps and dashes. Sources of data were census and other documents.
Principal variables in the study cover the active population and its occupational segments, size of commerce, extent of industry, size of rural development, production per capita, density of population, illiteracy, family size, and agricultural production. Derived measures include indices of rural occupational stability, of dependency within the urban middle class, and of rural landowners. See Torcuato S. Di Tella, La Teoría del Primer Impacto del Crecimiento Económico, Rosario, Universidad del Litoral, 1965.
C. Philippe Schmitter, An Aggregate Data Bank and Indices of Brazil. 22 cases, 10 cards of data per case. The data bank covers three time periods (1940, 1950, and 1960) and 22 states of Brazil. Data for the aggregate data bank were obtained from several Brazilian census publications, including Anuario Estatistico de Brasil and Recenseamento Geral de Brasil.
For each of the three time periods, the data give total population, rural employment, and industrial and commercial employment. Literate population, eligible electorate, and actual voting electorate are also available in the data. The data ascertain numbers of industrial and commercial establishments as well as membership in various unions, in art and literary associations, in sports organizations, and in Roman Catholic religious organizations.
Extra-National and International Data (Exclusive of several in Category II which could be cross-classified here as well)
A. Events Data
1. Walter H. Corson, East-West Project: Event Data, 1945-1965. Data for approximately 15,000 events. Each case is the report of a conflictive or a cooperative action (these include both verbal statements and non-verbal actions) within and between the nations comprising the NATO and Warsaw treaty alliances, Yugoslavia, and the Chinese Peoples' Republic. The events recorded cover East-West relations from 1945 to 1965. Each event is rated on conflict and cooperation intensity ratio scales which were established from questionnaires given to experts in international relations. Each event is also coded for actors and targets involved, date, geographic area, action category, and source. A short textual description of each event is included. NOTE: These data are temporarily restricted. The principal investigator requests that those desiring the data write to him for release at 4107 North 35th Street, Arlington, Virginia 22207.
2. Ivo Feierabend, Rosalind Feierabend, and F.M. Jagger, Data Bank of Assassinations: 1948-1967. Data on 409 assassination attempts in 84 countries, perpetrated between 1948 and 1967, gathered from the New York Times Index. Data include plotted, attempted, or actual murders of prominent public figures such as top governmental office-holders and military figures, leaders of large trade unions or religious movements, or leaders of minority groups. With each event, information is coded on the country, date and location of occurrence, the actual (verbalized) name of the assassin, when available, and of the target, the issue, type of group to which the assassin belonged, and the political position of the target.
3. Ivo Feierabend, Rosalind Feierabend, and Rose Kelly, Data Bank of Minority Group Conflict. This data collection is a preliminary compilation of events denoting conflict between minority groups (ethnic, racial, linguistic, religious) and the predominant group within the society. Data are collected from Deadline Data on World Affairs covering the ten-year period, 1955-1965, but include only those 43 countries which have minority groups. A separate set of ratings for each country has also been made regarding the type of policies and objectives pursued both by the government toward minority groups (e.g., integration, autonomy, segregation, etc.) and by the minority groups themselves. Data include the initiator group, target group and third party to the event, name of the minority group (or groups) involved in the event, and the nature of the issue underlying the dispute, for both amicable and hostile events.
Furthermore, some 59 different types of events are distinguished, and each is coded for characteristics such as country, date, location, duration, number of persons involved, number of persons injured, number of persons killed, number of persons arrested, scaling of intensity of event on both hostility and amity scales devised by the investigators, outcome, number of significant persons involved, etc.
4. Ivo Feierabend, Rosalind Feierabend, and J.S. Chambers, Transactional Data Bank of Inter-Nation Conflict and Amity Events. Data on some 7,000 events. Sixteen types of hostile transactional events are included in the coding format, ranging from protests, accusations, and recall of officials to quasi-military actions, troop mobilizations, and war. Fourteen types of amity events are also covered, ranging from offers to negotiate and confer to exchanges, agreements and alliances. Events are qualified in nineteen categories, including date, actor, duration and persons involved. The direction of the event and its retaliatory or nonretaliatory character are also included. These data are of a preliminary nature.
5. Charles McClelland, World Event/Interaction Survey (WEIS). Data for 23,000 events. Each case in the data set is a report of an international event. An event/interaction refers to words and deeds communicated between nations, such as threats of military force between nations. The IRA has WEIS data from
January, 1966, through August, 1969. Coded for each event is the actor, target, date, action code, arena and source of each item. Also included is a descriptive deck which is a complete textual description for each event. The IRA can supply two FORTRAN IV programs which can aggregate the daily data into other groupings, e.g., frequencies of action by one nation toward another on a month-by-month basis can be calculated.
6. Rudolph J. Rummel, Foreign Conflict Behavior. Approximately 13,000 events such as border clashes and threats on over 30 descriptive variables for 82 nations. The source of the data as well as measures of its reliability have also been coded. The periods of time covered include 1955, 1962-1965, and the first four months of 1966.
7. Charles L. Taylor and Michael C. Hudson, World Handbook of Political and Social Indicators II: Daily Event Data. This data set contains 57,268 records
of data for 17 political events: riots, deaths from political violence, political assassinations, armed attacks, elections, protest demonstrations, regime support demonstrations, political strikes, renewals of power, unsuccessful executive transfers, unsuccessful irregular transfers, irregular power transfers, executive adjustments, regular executive transfers, executions, acts of negative sanctions, and acts of relaxation of political restrictions. The data are recorded at daily intervals for each event group for each country during the twenty-year period 1948-1967. For example, two riots in a country on the same day appear as one record or case; but one riot and one election in a country on the same day appear as two separate records. Seven sources were used including the New York Times Index and AP.
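The month-by-month aggregation mentioned in the WEIS entry above (frequencies of action by one nation toward another) can be illustrated with a minimal sketch. This is not the archive's FORTRAN IV program; the record layout, country codes, and action labels below are invented for illustration only.

```python
from collections import Counter
from datetime import date

# Hypothetical daily event records: (actor, target, date, action_code).
# The codes are invented for illustration and are not actual WEIS codes.
events = [
    ("USA", "USR", date(1966, 1, 3), "threaten"),
    ("USA", "USR", date(1966, 1, 17), "accuse"),
    ("USR", "USA", date(1966, 1, 20), "accuse"),
    ("USA", "USR", date(1966, 2, 2), "consult"),
]


def monthly_dyad_counts(records):
    """Count events per directed pair (actor, target) and calendar month."""
    counts = Counter()
    for actor, target, day, _action in records:
        counts[(actor, target, day.year, day.month)] += 1
    return counts


counts = monthly_dyad_counts(events)
for key in sorted(counts):
    print(key, counts[key])
```

Because the pair is directed, actions by one nation toward another are counted separately from actions in the reverse direction. Grouping by action code as well would only require adding that field to the key.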
B. Conflict Studies
1. Lincoln Bloomfield and Robert Beattie, CASCON Project: Local Conflict Data. Data on 52 local conflicts since 1945. There are some 500 "factors" coded for each case. Factors are conditions or situations which might influence the course of a local conflict toward or away from increased violence. Each factor is coded as either no information, not present, present but no influence, much influence toward violence, some influence toward violence, little influence toward violence, much influence away from violence, some influence away from violence, or little influence away from violence. Factors are grouped into categories: previous relations between
sides, great power involvement, external relations, military-strategic, international organizations, ethnic minorities, economic, internal political, characteristics of one side, communication, actions or controls in disputed area. See Lincoln Bloomfield and Robert Beattie, "Computers and Policy Making: The CASCON Experiment," Journal of Conflict Resolution, Volume XIV, Number 4 (March, 197.
2. Richard Cady and William Prince, Political Conflicts: 1944-1966. Data for 323 conflicts. These data were supplied by the Social Science Division, Bendix Corporation. The data set contains information on political conflicts which occurred during the period 1944 to 1966. For each conflict, the variables include measures of duration, the type of military operations, the type of conflict, the method of termination, and the outcome with regard to the United States.
3. Michael Haas, International Subsystems: War Data. Data derived from four major studies of war: Lewis F. Richardson, Statistics of Deadly Quarrels (31 variables); Quincy Wright, A Study of War (15 variables); Pitirim Sorokin, Social and Cultural Dynamics (20 variables); and J. David Singer and Melvin Small, Wages of War (19 variables). Data are for 1649-1963 for 21 international subsystems. Each war in each subsystem from each study is the unit of analysis. Variables include length, type, outcome, participants and intensity. See Michael Haas, "International Subsystems: Stability and Polarity," The American Political Science Review, LXIV, Number 1 (March, 1970), 98-123.
4. J. David Singer and Melvin Small, The Wages of War: War Data. The war file contains data for 93 interstate, imperial, and colonial wars dating from 1816 through 1965. Wars which did not involve at least one interstate system member or which were civil or internal were eliminated. There are 30 variables including the beginning and ending dates of the war, location, nation-months, and battle deaths. For further information see J. David Singer and Melvin Small, The Wages of War, 1816-1965: A Statistical Handbook, John Wiley and Sons, 1972.
5. Charles L. Taylor, Michael C. Hudson and John D. Sullivan, World Handbook of Political and Social Indicators, II: Intervention Data. This data set contains data for interventions recorded at daily intervals during the twenty-year period 1948-1967.
The daily report is the unit of analysis. There are 1,073 records, one for each day on which an intervention occurred in a country. The number of records per country varies. If, for example, a country had no action meeting the criteria for inclusion as an intervention, no record is given for that day for that country. Those countries not involved in an intervention were excluded. Data are recorded for 89 of the 136 nations in the World Handbook Aggregate Data file and two international organizations. Some of the 31 variables included are the number of interveners, type of group involved, air and naval incursions and length of intervener's presence in the country. The data sources are the New York Times Index, Associated Press, Asian Recorder, African Research Bulletin, Middle East Journal, and African Diary.
C. Data on Dyads
1. William D. Coplin and J. Martin Rochester, Dyadic Disputes. Data for two basic units of analysis:
71 nations and 121 cases. This study provides data to compare and analyze the Permanent Court of International Justice, International Court of Justice, League of Nations, and United Nations in the international bargaining process. Data are included for all disputes: (1) which occurred between 1920 and 1968; (2) which were dyadic, i.e., in which only two states were directly involved; and (3) which were considered in at least one of the four institutions. Nation-unit data are divided into national attributes of participants and patterns of institutionalized usage by participants. Case-unit data include case attributes and attributes of the dyadic relationship between the two participants in each case. For further information see William Coplin and J. Martin Rochester, "The Permanent Court of Justice, the International Court of Justice, the League of Nations and the United Nations: A Comparative Empirical Survey," The American Political Science Review, Vol. LXVI, No. 2 (June, 1972), 529-550.
2. John Gillespie and Dina Zinnes, World Trade Data: 1958-1968. This data set contains export and import trade data collected on a country by country, directional basis. The source is the International Monetary Fund series of annual volumes, Direction of Trade. All data are reported in U.S. dollars.
3. Amelia Leiss, Arms Transfers. Data on arms transfers to 52 less developed countries. The transfer is the unit of analysis. Variables include the
donor, the recipient, the type of weapons system, the quantity transferred, certain characteristics of the system and the site of transfer. A second file contains detailed coded information about each weapons system.
4. Lewis Fry Richardson, Statistics of Deadly Quarrels: 1809-1949. Data for 779 dyadic quarrels from some 300 conflicts. These data, supplied by Rudolph Rummel, cover the time period from 1809 to 1949. A dyadic quarrel is a situation involving a pair of opponents and resulting in more than 315 human deaths. The magnitude of a quarrel is measured by the logarithm of the number of deaths. The range of magnitude in the study is from 2.50 to 7.50, the latter figure for nations involved in World War II. Each quarrel is identified by its beginning date and magnitude. For each quarrel, the nominal variables include the type of quarrel, as well as political, cultural, and economic similarities and dissimilarities between the pair of combatants. The data were originally published in Lewis Fry Richardson, The Statistics of Deadly Quarrels, Chicago: Quadrangle, 1960.
5. J. David Singer and Melvin Small, The Wages of War, 1816-1965: Pairs File. The Pairs file contains data on 1,312 pairs of nations involved in wars. There are 41 variables including type of war, duration and characteristics of each side. See J. David Singer and Melvin Small, The Wages of War, 1816-1965: A Statistical Handbook, John Wiley and Sons, 1972.
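The magnitude scale used in the Richardson data above can be made concrete with a small sketch. Base 10 is assumed for the logarithm; this is consistent with the stated range, since a magnitude of 2.50 then corresponds to roughly 316 deaths, close to the 315-death threshold in the entry. The function names are hypothetical.

```python
import math


def magnitude(deaths):
    """Magnitude of a quarrel: base-10 logarithm of deaths (base assumed)."""
    if deaths <= 0:
        raise ValueError("deaths must be positive")
    return math.log10(deaths)


def deaths_at(mag):
    """Inverse of magnitude(): the death toll implied by a given magnitude."""
    return 10 ** mag


# The lower and upper bounds of the range reported for the study.
for mag in (2.5, 7.5):
    print(mag, round(deaths_at(mag)))
```

On this reading the upper bound of 7.50 corresponds to a death toll in the tens of millions.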
D. Data on International Organizations
1. Chadwick E. Alger, United Nations Interaction. Data on interactions between U.N. delegates. The data were generated by direct observation of the meetings of the Administrative and Budgetary (Fifth) Committee of the General Assembly during the Seventeenth Regular Session (1962). The unit of analysis is the U.N. member-country as represented by its delegates. Seventy-one variables have been coded for each country-delegate and his interactions.
2. Ivo Feierabend, Rosalind Feierabend and Betty Nesvold, Political Events Project: 1948-1965. Data on 8,000 events for 84 countries. This study is concerned with the amount of conflict directed by groups and individuals in the prevailing political system against other groups or persons. The data
cover the interval 1948-1965. Twenty-eight categories are used to classify the events. The study provides a conflict intensity rating for each event. The data sources were the Encyclopedia Britannica Yearbook and Deadline Data on World Affairs. An additional data set is being prepared from the New York Times for the period 1955-1964. Data were originally used in Ivo Feierabend and Rosalind Feierabend, "Aggressive Behaviors Within Polities, 1948-1962," in John Gillespie and Betty Nesvold (eds.), Macro-Quantitative Analysis, Beverly Hills, California, Sage Publications, 1971.
3. Harold K. Jacobson, The United Nations and Colonialism. Data on 1,166 U.N. roll calls. The unit of analysis is the U.N. roll call, specifically, any roll call from 1946 to 1967 concerned with the issue of colonialism. The data contain information about each roll call and the voting record of U.N. member-countries on each roll call. Data sources were the United Nations General Assembly Official Records. Data are in vote format. That is, the roll call vote is the case; the vote of each member and other descriptive information are the variables.
4. Michael Wallace and J. David Singer, Intergovernmental Organization Data: 1816-1964. Data for 237 intergovernmental organizations extant between 1815 and 1967. The membership status of 148 countries is recorded for each organization at five-year intervals. A nation is coded as being a full member of the particular IGO during the given time period, an associate member, a member of the international system but not a member of the IGO, or not a member of the system. See Michael Wallace and J. David Singer, "Intergovernmental Organization in the Global System, 1815-1964: A Quantitative Description," International Organization, Volume XXIV, Number 2 (1970), 239-287, and J. David Singer and Michael Wallace, "Intergovernmental Organization and the Preservation of Peace, 1816-1964: Some Bivariate Relationships," International Organization, Volume XXIV, Number 3 (1970), 520-547.
5. Charles Wrigley and ICPR, United Nations Roll Call Data. General Assembly roll calls for the First to the Twenty-fifth Plenary Sessions (1946-1970), the First to the Fifth Special Sessions, and for the seven committees. Portions of the roll call collections were archived from two different sources. The First to the Seventeenth Sessions, the First to the
Fourth Special Sessions and the First to Fourth Emergency Special Sessions were received from Charles Wrigley of Michigan State University. The data from all subsequent Plenary Sessions and for the committees were coded and processed by the International Relations Archive. All of the data are stored in member format. That is, the U.N. member is the case, the roll call is the variable, and the member's vote is the value for each variable. The codebook contains a synopsis of each roll call including the total vote on that roll call and its location in the General Assembly Official Records.
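The two roll call layouts described in this section, vote format (the roll call is the case) and member format (the member is the case), hold the same information and differ only in orientation. A minimal sketch of converting one into the other follows; the roll call identifiers and votes are invented, and nested dictionaries merely stand in for the archive's actual file layout, which is not described here.

```python
# Hypothetical roll calls in "vote format": one case per roll call,
# with each member's vote stored as a variable of that case.
vote_format = {
    "RC001": {"Brazil": "yes", "Chile": "no", "Mexico": "abstain"},
    "RC002": {"Brazil": "no", "Chile": "no", "Mexico": "yes"},
}


def transpose(table):
    """Swap cases and variables in a nested {case: {variable: value}} table."""
    flipped = {}
    for case, variables in table.items():
        for variable, value in variables.items():
            flipped.setdefault(variable, {})[case] = value
    return flipped


# "Member format": one case per member, one variable per roll call.
member_format = transpose(vote_format)
print(member_format["Chile"])
```

Applying the same function to the member-format table recovers the vote-format table, so a user holding data in either layout can derive the other.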
E. Data on Alliances
1. Bruce Russett, International Military Alliance Data: 1920-1957. Data on 44 variables for 137 alliances signed between 1920 and 1957. Categories of variables include background of the alliance, terms of the alliance, the type of alliance, characteristics of the members, and outcomes of the alliance. For further information see Bruce Russett, "An Empirical Typology of International Military Alliances," Midwest Journal of Political Science, XV, 2 (May, 1971), 262-289.
2. J. David Singer and Melvin Small, Annual Alliance Membership Data: 1815-1965. Yearly records of which nations are in alliance with which other nations and the type of alliance, i.e., defense, neutrality, or entente. For further information, see J. David Singer and Melvin Small, "Formal Alliances 1815-1939," Journal of Peace Research, No. 1, 1966; "Formal Alliances, 1815-1965," Journal of Peace Research, No. 3, 1969; and "Alliance Commitments 1815-1945," Peace Research Society Papers V, Philadelphia Conference, 1966.
F. Data on International Systems
1. Michael Haas, International Subsystems: Subsystem Data. Data for 21 international subsystems, 1649-1963. There are some 25 variables including number of wars, polarity, alliances, number of members and resources. See Michael Haas, "International Subsystems: Stability and Polarity," The American Political Science Review, LXIV, 1 (March, 1970), 98-123.
2. Bruce M. Russett, International Regions and the International System. The study contains data on regions of social and cultural homogeneity, regions with similar political attitudes on external behavior (measured by U.N. voting), regions of political interdependence (bound by international organization), regions of economic interdependence (measured by intra-regional trade as a proportion of the nations' national income), and regions of geographical proximity. For further information see Bruce Russett, International Regions and the International System: A Study in Political Ecology, Chicago: Rand McNally and Company, 1967.
PROBLEMS OF DATA ACQUISITION IN LATIN AMERICA: THE ROPER PUBLIC OPINION RESEARCH CENTER
Philip K. Hastings Director, Roper Center Williams College
H. Jon Rosenbaum The City College, C.U.N.Y. Latin American Representative of the Roper Center
ONE OF THE MOST OBVIOUS OBSERVATIONS THAT CAN BE MADE ABOUT DATA banks is that their quality is largely determined by their holdings. One would expect subscribers to data bank services, therefore, to be particularly concerned with policies relating to data acquisition and usage potential. But paradoxically, users often seem more interested in such secondary problems as service charges, hardware, and the preparation and cleaning of collected data.
This paradox is difficult to understand. Certainly there is no dearth of significant and controversial issues associated with data collection, whether it be conducted in the United States or elsewhere. For example, many Latin Americans and other Third World citizens are vitally concerned with activities of American data bank representatives and individual researchers in their midst. Charges of cultural imperialism have become all too common.
The first section of this chapter is devoted to a consideration of data bank acquisition policies, particularly as they relate to Latin America. While the Roper Center has been actively engaged for a quarter of a century in the procurement of Latin American survey data, the following remarks will not be restricted to the Center's experiences. Included as main points of discussion are acquisition procedures, competition and coordination, confidentiality, and cultural imperialism. The second section reviews in some detail the history, aims, contents, and usage of the Roper Public Opinion Research Center, the world's largest Latin American survey data resource.
As anyone who has conducted research in Latin America realizes, collecting data there is often an arduous task presenting problems not normally encountered in this country. First, the foreign data collector must identify potential suppliers. While this would appear to be a relatively simple matter, it is difficult at times. In the case of survey research, for example, commercial firms and academic institutes may be small and not widely known. This is especially so in the smaller Latin American nations, although it is also common in the provincial cities of the larger
countries. The result is that American archives contain surveys conducted almost exclusively in the capital cities of the most populous Latin American nations. The few studies in their possession conducted in the smaller countries and cities were for the most part conducted by American scholars. Individual Latin American scholars conducting valuable research are even more difficult to identify, and thus, their work is poorly represented in the holdings of American data archives. To surmount these problems it is necessary for American survey research libraries to employ local representatives or have their American agents remain in Latin America for protracted periods.
Once a potential supplier has been identified, contact must be made. The collector must ascertain in advance of a visit the amount and sort of material the supplier is likely to possess. Otherwise a preoccupied supplier may claim to have nothing of interest or, in order not to offend, offer a token study. Careful planning is required. To be effective the gatherer must fully explain the purposes and operations of the data archive being represented. The presentation of a personally addressed letter from the director of the data archive and a brief description of the center written in the local language may be of value. Likewise, the cultivation of the desired association may be assisted by offering introductory letters from other suppliers of the same nationality. This procedure also may be beneficial when a rivalry between suppliers exists. Knowing that a competitor is cooperating with the data library, a supplier may wish to obtain similar visibility in the United States.
An initial contact, however, usually will not prove sufficient. Individual scholars, research institutes, and commercial firms generally place a rather low priority on providing their material to foreign data archives. Even those anxious to cooperate may not have the facilities or staff needed to duplicate and organize material. Waiting for delivery of promised data can be a frustratingly slow process. The collector must have considerable patience and often make repeated visits to suppliers. Even though the person with authority may be willing to collaborate with the data bank, at times the staff resents the arrangement and additional effort is necessary, thus delaying data preparation.
A final problem confronted by the collector is the creation of durable relations. Despite assurances to the contrary, often additional studies are not shipped to the data archive once the gatherer has returned to the United States. Pleading letters usually have little utility in surmounting this difficulty nor will infrequent, short trips to Latin America insure a regular flow of data to the archive. Again, the data bank must in some visible form perpetuate its presence in the area.
An experienced collector employing appropriate procedures still may be unable to obtain desired data from suppliers. A major obstacle to the successful procurement of data is the
reluctance of producers to furnish studies originally conducted under contract with private firms or governments. Material of this sort often is owned by the contracting party, and the producer must get prior approval before releasing it. The client may be unwilling to provide the necessary permission, fearing that the data bank or its users will misuse the studies. Commercial firms may be concerned that their marketing studies will be made available to competitors. Governments may conclude that studies conducted for them will be exploited by political adversaries if placed in the public domain.
On the other hand, potential suppliers may withhold data in order to protect their reputations. Research organizations may be apprehensive that their procedures will be criticized by data bank users in the developed countries or that secondary analysis by careless scholars will reflect badly on them. Misgivings may also be due to a belief that rival organizations or researchers gaining access to data will learn how their staff is trained as well as other technical matters that the data producer might like to conceal. Finally, suppliers may feel that an archive will profit unduly from their labors and that it would be wiser to maintain absolute control over their products, selling them only to individual users.
A variety of arguments can be helpful in surmounting the obstacle presented by these anxieties. Suppliers may be induced to cooperate if it is suggested that by providing data to the archive they will gain exposure which might be useful in recruiting clients. They can also be told that their collaboration with the data archive will be a contribution to the educational process, the resolution of international tensions, and the exchange of ideas.
However, some data will continue to be unobtainable unless the supplier is given complete assurance that donated material will remain confidential for a specified period. This requires the data archive to establish an access classification system. Each data supplier sending study materials to the Roper Center, for example, places the individual studies in one of the following categories:
Category I: Those studies which the original data suppliers and their clients will permit the Roper Center to duplicate and rediffuse to scholars on a permanent basis, and which the recipients may in turn duplicate and rediffuse without any restriction whatsoever.
Category II: Those studies which the original data suppliers and their clients will permit the Roper Center to duplicate for loan to all scholars and for permanent placement with members of the Center's International Survey Library Association.
Category III: Those studies which the original data suppliers and their clients will permit the Roper Center to release under loan contract only.
Category IV: Those studies which the original data suppliers and their clients place at the Roper Center with
access granted only after written permission has been obtained by the supplier.
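The four access categories above amount to a small decision rule that an archive could apply to each request. The following is a minimal sketch of one such rule, not the Roper Center's actual procedure. The class and function names, the two request types ("loan" and "permanent"), and the treatment of Category IV (written permission is assumed to open either kind of release) are illustrative assumptions.

```python
from enum import IntEnum


class Access(IntEnum):
    UNRESTRICTED = 1            # Category I: permanent, may be rediffused
    LOAN_OR_ISLA_PLACEMENT = 2  # Category II: loan to all, permanent for members
    LOAN_ONLY = 3               # Category III: loan contract only
    BY_PERMISSION = 4           # Category IV: written permission required


def may_release(category, request, is_isla_member=False,
                has_written_permission=False):
    """Return True if a study in `category` may be released for `request`.

    `request` is "loan" or "permanent" (hypothetical request types).
    """
    category = Access(category)  # raises ValueError for an unknown category
    if request not in ("loan", "permanent"):
        raise ValueError("request must be 'loan' or 'permanent'")
    if category == Access.UNRESTRICTED:
        return True
    if category == Access.LOAN_OR_ISLA_PLACEMENT:
        return request == "loan" or is_isla_member
    if category == Access.LOAN_ONLY:
        return request == "loan"
    # Category IV. Assumption: permission opens either kind of release.
    return has_written_permission


print(may_release(Access.LOAN_OR_ISLA_PLACEMENT, "permanent"))
print(may_release(Access.LOAN_OR_ISLA_PLACEMENT, "permanent",
                  is_isla_member=True))
```

Recording the category with each study, rather than with each supplier, matches the statement that suppliers classify individual studies.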
It is incumbent upon data archives to meticulously observe restrictive agreements once concluded. Failure to scrupulously abide by these arrangements will not only become widely known and discredit an archive but will almost certainly hinder the data collecting efforts of other data libraries.
Another impediment to data acquisition is competition among archives. Duplication of effort is not only wasteful, but it may cause some data suppliers to withhold material. These suppliers become weary of servicing numerous requests for identical data and therefore refuse to release their data at all. Other suppliers may attempt to profit from competition among data archives, raising their rates and selling material to the highest bidder.
At present there is very little competition among American data banks collecting in Latin America. Data pooling or interchange agreements among the major archives holding Latin American data have made the duplication of effort unnecessary. The University of Florida's Latin American Data Bank, Berkeley's International Data Library, Michigan's Inter-University Consortium for Political Research, and the Roper Center all exchange data.
However, there is some acquisition redundancy among American archives. Certainly no individual data archive is capable of assembling all of the machine readable social science data in Latin America. Since duplication benefits no one, a division of labor should be easily arranged.
One of the main missions of the now defunct Council of Social Science Data Archives (CSSDA) was to coordinate the activities of data archives, including data acquisition. Unfortunately it was not successful, mainly because it lacked enforcement power and devoted most of its financial resources to quite different tasks, e.g., creating inventories of data holdings, exploring the potential usefulness of telecommunications networks, etc.
Nevertheless, those archives now actively collecting data in Latin America generally rely upon different sources for their materials. The Roper Center is acquiring survey data from commercial firms such as the Gallup and INRA affiliates. Florida's Latin American Data Bank is collecting census materials from government agencies. Berkeley's International Data Library is gathering surveys conducted by scholars. Specialization of this sort should be continued not only to prevent duplication but to foster meaningful exchange. As in international trade, comparative advantage can make the interchange among data banks more meaningful and productive. The current allocation of responsibilities is informal and developed spontaneously. Perhaps the Latin American Studies Association or some other appropriate organization should institutionalize it and provide necessary
supervision. This need not preclude new data banks from collecting data in Latin America. Clearly there are sufficient tasks to occupy several new entrants wishing to gather data in Latin America. Responsibility for data collection in specified countries or the acquisition of particular types of material could be assigned to these newcomers.
Cooperation among data archives interested in Latin America might also include the establishment of several joint acquisitions offices in the major Latin American countries. As was mentioned earlier, irregular short trips to Latin America are not the best means of collecting data there. A permanent presence, such as that maintained by the United States Library of Congress, is preferable, but expensive. However, by pooling their resources, there is no reason why such acquisition centers could not be created by the data archives. A joint funding proposal could even be submitted to an appropriate agency.
Another cooperative venture might consist of the development of a "union catalogue" of machine readable data relating to Latin America. Such a catalogue would make the exchange of data more routine than it is at present since each archive would be aware of other data banks' holdings. The catalogue would also contribute to the prevention of wasteful duplication in data acquisition.
The proposed catalogue could be recorded on tape and therefore easily updated by placing new entries on tape when necessary. A simple catalogue could consist of merely a one-line listing of all aggregate and survey materials pertaining to Latin America held by the major data archives. A more ambitious annotated catalogue would contain the following information about each data set: (1) basic identifying data, (2) variables, and (3) sampling procedures, weighting, and a list of published studies. Such catalogues could be organized by country, chronologically, or by other means. Perhaps it would be worthwhile for the Latin American Studies Association or the Consortium of Latin American Studies Programs to partially subsidize this project so that the catalogue could be distributed to major research centers in the United States and Latin America.
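The structure of such a catalogue entry can be sketched as a simple record. The field names below are assumptions chosen to mirror the three groups of information just listed, and the sample entries are invented placeholders rather than actual holdings of any archive.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    # (1) basic identifying data
    title: str
    country: str
    year: int
    archive: str
    # (2) variables
    variables: list = field(default_factory=list)
    # (3) sampling procedures, weighting, and published studies
    sampling: str = ""
    weighting: str = ""
    publications: list = field(default_factory=list)

    def one_line(self):
        """The 'simple catalogue' form: a one-line listing."""
        return f"{self.country} | {self.year} | {self.title} | {self.archive}"


def by_country(entries):
    """Order entries by country, then by year within each country."""
    return sorted(entries, key=lambda e: (e.country, e.year))


def chronological(entries):
    """Order entries by year, then by country within each year."""
    return sorted(entries, key=lambda e: (e.year, e.country))


# Invented placeholder entries, for illustration only.
entries = [
    CatalogueEntry("Hypothetical survey B", "Peru", 1968, "Archive X"),
    CatalogueEntry("Hypothetical census file", "Brazil", 1970, "Archive Y",
                   variables=["age", "occupation"]),
    CatalogueEntry("Hypothetical survey A", "Brazil", 1965, "Archive X"),
]

for entry in by_country(entries):
    print(entry.one_line())
```

Keeping the one-line listing as a view of the fuller record means the simple and the annotated catalogue can be produced from the same file.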
Since the exposure of the notorious Project Camelot, quantitative data collecting in Latin America has been much more difficult. Data gatherers have had to confront apprehension, fear, and suspicion. Allegations that American data banks are involved in nefarious activities and are instruments of imperialism have not been uncommon.
It is often claimed that data stored in American data archives will be used by the United States government or multinational firms to exploit Latin America. There is anxiety that marketing data will be employed by multinational companies attempting to sell their products more effectively in Latin America. It is felt that by utilizing surveys commissioned by indigenous enterprises, multinational firms will be engaging in
economic imperialism. Actually very little market data is possessed by scholarly American data banks, and those studies that are held tend to be ancient (five or ten years old) when released to users. Thus in fact they are of little value to commercial competitors.
Some Latin Americans believe that American data banks will be used for ulterior purposes by the United States government. Their concern is that data archives could pose a threat to the sovereignty of Latin American nations since they could be abused by American government officials attempting to influence the course of politics in the region. This fear is not entirely irrational. After all, in the United States there is considerable discomfort about the effect of data banks on a free society. Yet, the United States government has no need to resort to the major academic data archives. It has its own resources and facilities. Furthermore, the confidential classification systems of the private archives are applicable to government users as well as to scholars and act as a deterrent to the misuse of data.
The major concern, however, is not that American data archives are innocent tools of the United States government and multinational economic interests but that the data banks are engaged in cultural imperialism. It is charged that the data banks are acquiring materials in Latin America and providing little in return. Of course the collection of quantitative data is not exactly identical to the illegal export of Mayan sculpture from Mexico, for example. The data that is gathered is reproducible and can easily be returned to Latin America upon request. Meanwhile, much of the material acquired would probably be destroyed eventually due to inadequate archival facilities. Most important, however, is the fact that most of the data sent to American data archives consists of duplicated material, and the original material remains in the country of origin.
The current practice is for data banks to reimburse suppliers for the duplication and other costs related to the transfer and shipment of data. The supplier also often receives a nominal honorarium and methodological advice from the data archive. Occasionally other services are provided by the archive as well. The Roper Center, for instance, sends suppliers complimentary copies of all of its publications.
No doubt American data libraries could make a larger contribution to the development of the social sciences in Latin America. A greater effort could be made to assist in the development of regional data banks in the area, although some technical assistance has already been provided. The University of Florida's Latin American Data Bank, for example, has been instrumental in the establishment of a counterpart data archive in Costa Rica. The Getulio Vargas Foundation, the Torcuato di Tella Institute, and the Brazilian State Data Bank at the Federal University of Minas Gerais also have benefited from LADB assistance.
The Roper Center would like to participate in the development of regional survey data banks in Latin America but so far
has been unable to attract adequate funding. Since there probably would not be sufficient Latin American users to make such data banks self-supporting at first, funding for a two to five year period would have to be obtained.
American data banks could also do more to facilitate the training of Latin Americans in the use and collection of machine readable data. Perhaps American data archives should attempt to secure funds to underwrite scholarships for Latin American graduate students wishing to attend the summer training sessions now being offered by the Roper Center and the University of Michigan.
Finally, fears of cultural imperialism could be allayed by the active intervention of the Latin American Studies Association or the Consortium of Latin American Studies Programs. The American Historical Association's Conference on Latin American History has established a Sub-Committee on Historical Statistics. A similar committee might be formed by LASA or CLASP. This committee could serve as a "better business bureau," applying moral suasion to ensure that fair prices are being paid for Latin American data and that material is not being pirated. The committee also could take an active role in generating funds which would allow American data archives to provide Latin Americans with training and technical support.
Many American data banks have acquired large amounts of Latin American material. Unfortunately these data have not been fully explored or utilized by Latin Americanists. Some Latin Americanists may lack the training necessary for the manipulation of machine readable data. Others may be unaware of the richness of the data contained in the archives. Still others may find that the material contained in the data banks is inappropriate for their research. Scholars interested in Latin America and the data banks must join together to overcome these problems, or the acquisition of data will be a wasted effort. Data banks must publicize their holdings and encourage the training of Latin Americanists in suitable analytical techniques. On the other hand, Latin Americanists must communicate their interests to the archives so that worthwhile materials will be collected.
Clearly the outstanding case in point of an incredibly rich yet virtually untapped Latin American data resource is the Roper Public Opinion Research Center. This Center presently holds the basic data from over 300 surveys conducted in virtually every major Latin American country. These studies date from the early 1950's to the present. It is currently acquiring approximately 50 new data sets annually.
What are the aims of the Roper Center? What data and services does it offer to social scientists? How may one most effectively tap into this unique international data resource?
With Williams College (located in Williamstown, Massachusetts, USA) serving as the host institution, the Roper Public Opinion Research Center was founded in 1946. This unique international social science research facility has, since its founding, made available to scholars throughout the world a vast reservoir of attitude and behavior data which were formerly either difficult to obtain or entirely inaccessible. To date, over 50,000 individuals have made use of the services and data offered by the Roper Center.
The general aims of the center are:
(1) To enrich both substantively and quantitatively the store of survey data readily available to social scientists throughout the world.
(2) To facilitate access to such data by scholars and others working in the public interest.
(3) To increase the amount of research being done with the data: research both of a long-range, basic value in furthering our understanding of human behavior as well as inquiries bearing more directly and immediately on various major problems of our times.
(4) To encourage an increasing degree of comparability in the primary sample survey research being conducted in various cultural and national milieux.
(5) To serve as a stimulus to cross-national primary and secondary research.
In 1946, the firm of Elmo Roper and Associates placed at Williams College the raw materials of their own surveys (dating from 1938) for Fortune magazine and various American industries. Since that time 117 other survey research organizations located in 68 countries have also been contributing their basic survey data. The acquisition rate at the present time is approximately 500 data sets per year.
At the moment the basic data from over 10,000 surveys are in the Center's data bank. About one-third of the data bank consists of American surveys with the remaining two-thirds coming from other parts of the world.
Included in the material for each study are the basic response data on punch cards or magnetic tape, a master coded questionnaire, relevant coding materials, and study specifications (e.g., the population universe, method of sampling, weighting procedures, etc.). Further, for many of the surveys, copies of research reports and public releases are available at the Center.
Generally, the Center provides three main types of service to its users:
(1) Search and retrieval
(2) Analysis
(3) Data set reproduction
Typically, processing these service requests involves the following:
Step 1: The Center's staff searches the question index for items relevant to the scholar's research problem. The output sent to the researcher is a listing (either computer-generated or xeroxed) of potentially relevant questions (the text of the question, the month and year of the survey, the approximate sample size, and an identification of the survey organization).
Step 2: At this stage many users request a copy of the entire questionnaire(s) containing the item(s) of central importance to their research. For later analysis they need to know what other attitudinal or behavioral variables and demographic data were also included in the survey.
Step 3: Analysis or Data Set Reproduction
Analysis: Upon written authorization from the researcher and following his analysis specifications, the Center's technical staff processes the data. Analysis requests may involve one or more of the following:
- Marginal (total sample) results with percentages
- Multi-variable cross-tabulations with percentages (the Center's programming capacity is at the five-variable level)
- If requested, the inclusion of phi, gamma, and chi square statistics
- Response data deck x-ray (12 x 80), including the N's and percentages of each possible coded position based on the total respondent count
Data Set Reproduction: A data set consists of a punch card deck (or magnetic tape) and a codebook. Typically public opinion data are still coded on standard EAM (Electronic Accounting Machines) data cards using the column binary or multi/alphabetic punch method of coding. This is the case for about 90% of the data sets at the Roper Center. For those scholars using unit record equipment (e.g., counter-sorts, IBM 101's, etc.) no real problem exists. A problem does exist, however, for those users requesting duplicates of the original data with the intention of reading and analyzing the information on computers. While most computing systems, with modification or additional hardware, can read and analyze multi-punch cards or binary coded tapes, few have this feature as standard equipment. Even though the Center maintains its data files in their original form, it has the capacity (necessary software, etc.) to recode multi-punched data into expanded acceptable BCD card or tape records. The Center has in its computer system the necessary hardware to generate tapes written in a mode compatible with most computing facilities throughout the world.
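The deck x-ray and the expansion of multi-punched columns described above can be illustrated with a short Python sketch. It is not the Center's software. The in-memory representation (each card column held as a 12-bit integer, with row 12 in the high bit and row 9 in the low bit), the function names, and the choice of expanding each column into twelve 0/1 characters are all assumptions made purely for illustration.

```python
from collections import Counter

# Row labels of a punch card column, top to bottom (12 punch positions).
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def punches(column_bits):
    """Return the row labels punched in one column.

    Assumption: a column is a 12-bit integer whose most significant bit
    is row 12 and whose least significant bit is row 9.
    """
    return [row for i, row in enumerate(ROWS) if column_bits & (1 << (11 - i))]


def expand_column(column_bits):
    """Expand one (possibly multi-punched) column into twelve 0/1 characters,
    one character per punch position, so each field holds a single code."""
    return "".join("1" if column_bits & (1 << (11 - i)) else "0" for i in range(12))


def xray(deck, n_columns=80):
    """Count, for each column and row, how many cards carry that punch.

    Returns {(column_number, row_label): (n, percent_of_cards)};
    column numbers start at 1. Positions never punched are omitted.
    """
    counts = Counter()
    for card in deck:
        for col, bits in enumerate(card[:n_columns], start=1):
            for row in punches(bits):
                counts[(col, row)] += 1
    total = len(deck)
    return {key: (n, 100.0 * n / total) for key, n in counts.items()}


# Hypothetical two-card deck, truncated to two columns for brevity.
deck = [
    [0b000100000000, 0b100000000001],  # col 1: row 1; col 2: rows 12 and 9
    [0b000100000000, 0b000000000001],  # col 1: row 1; col 2: row 9
]
for (col, row), (n, pct) in sorted(xray(deck, n_columns=2).items()):
    print(f"column {col:2d} row {row:>2}: N={n} ({pct:.0f}%)")
```

Expanding every column to twelve fields makes the records longer, but each field then carries one code, which is the property that lets software without multi-punch support read the data.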
The cataloging system used by the Center is two-fold. An up-to-date inventory of every data set located at the Center is maintained. Each study is very briefly cited in this inventory (the Latin American inventory is found in Appendix A).
In addition, using the individual question as the basic indexing unit, each question in each survey is assigned a major and minor category tag or label. Many of the items are multiple-filed, i.e., tagged with two or more major and minor category headings. (See Appendix B for examples of item index cards.) Once tagged, the questions with tags are transposed to machine-readable form so that question listings can be computer-generated. Question listings are produced mainly in terms of substantive categories, but the index is so constructed that users can specify, in addition to topic or subject matter, particular countries, particular time points, particular survey research organizations, etc. The following is an illustrative set of major and minor categories used by the Center:
Labor:
Absenteeism
AFL and CIO
Automation
Child Labor
Closed Shop
Co-determination
Conscription of
Disputes, Labor-Management
Efficiency
Fringe Benefits (see also: Job Satisfaction)
Government and, General
Guaranteed Annual Wage
Hiring the Handicapped
Job, Availability of (see: Economic Affairs, Unemployment)
Job, Duration of
Job Satisfaction
Jurisdictional Disputes
Leaders
Level of Information
Minimum Wages
Number of Employees (see: Working Conditions)
Political Activity
Portal to Portal Pay
Productivity
Retirement (see: Economic Affairs)
Right to Work Laws
Strikes (for Strikes by Teachers, see: Education)
Taft-Hartley Law
Training of
Unemployment (see: Economic Affairs, Unemployment)
Union Shop
Unions, attitude toward
  communism in (see: Communism)
  corruption in
  Government Regulation of
  War Effort of
Wage Controls
Wages
Women and (see also: Marriage and Family)
Working Conditions (see also: Business, Public Relations)

Latin America:
Argentina
Dominican Republic
General
Guatemala
Interest in
Leaders
Level of Information
Mexico
U.S. Policy Toward
World War II and, General

Minorities and Ethnocentrism:
Aliens
Catholics (see: Religion)
Employment of Minorities
Ethnic Groups
Freedom Riders
General
German-Americans
Government and, General
Housing Problems
Indians (American)
Italian-Americans
Japanese-Americans
Jews
Ku Klux Klan
Level of Information
Negroes (see also: Particular Problem under Minorities)
Personal Relations with Minorities
Press and (see: Press, Abuse of Power)
Refugees, Attitude Toward
Regionalism
Restrictions on Minorities (see also: Housing Problems; Employment of)
Segregation, General
  Armed Forces
  Education
  Sports
United States War Involvement and (see also: Japanese-Americans)
Voting Restrictions

In addition, the Center has developed a supplementary index of the factual information recorded on each individual survey. The availability of this retrieval tool permits scholars whose primary interest is in analyzing particular sub-populations (e.g., ethnic, economic, religious groups, etc.) to gain immediate access to those surveys in which such information was recorded. The following is an excerpt from the coding categories for the Center's face data index:

Item:
Age
Birth: by country
Birth: by section and/or state
Community classification: actual size
Community classification: rural-urban
Economic level: Interviewer rating
Economic level: Objective rating
Education: General Level
Education: By Grade
Ethnic Group
Family Member in Service
Family Member of Military Age
Geographic Location: Region
Geographic Location: State
Membership: Church
Membership: Fraternal Organization
Membership: Civic Organization
Membership: Labor Union
Membership: Veteran's Organization
Occupation: Of Respondent
Occupation: Of Chief Wage Earner
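The two retrieval tools described above, the question index with its major and minor category tags and the face data index, can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the record layout, the `search` function, and the three sample entries (including their question wording) are invented here and do not reproduce the Center's actual index or holdings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedQuestion:
    """One entry in a hypothetical machine-readable question index."""
    text: str
    country: str
    year: int
    organization: str
    tags: tuple  # (major, minor) pairs; two or more pairs = multiple-filed
    face_items: frozenset = frozenset()  # face data recorded on the survey


def search(index, major=None, minor=None, country=None, years=None,
           organization=None, face_item=None):
    """Return entries matching every criterion that was supplied.

    `years` is an inclusive (first, last) pair; criteria left as None are ignored.
    """
    hits = []
    for q in index:
        if (major or minor) and not any(
            (major is None or m == major) and (minor is None or n == minor)
            for m, n in q.tags
        ):
            continue
        if country is not None and q.country != country:
            continue
        if years is not None and not (years[0] <= q.year <= years[1]):
            continue
        if organization is not None and q.organization != organization:
            continue
        if face_item is not None and face_item not in q.face_items:
            continue
        hits.append(q)
    return hits


# Invented sample entries; category names follow the lists above.
INDEX = [
    IndexedQuestion("Invented item on strikes", "Brazil", 1968, "MARPLAN",
                    (("Labor", "Strikes"),),
                    frozenset({"Membership: Labor Union"})),
    IndexedQuestion("Invented item on U.S. policy and unions", "Argentina", 1962,
                    "INRA-USIA",
                    (("Latin America", "U.S. Policy Toward"),
                     ("Labor", "Unions, attitude toward")),
                    frozenset({"Occupation: Of Respondent"})),
    IndexedQuestion("Invented item on minimum wages", "Brazil", 1970, "MARPLAN",
                    (("Labor", "Minimum Wages"),)),
]

# Labor questions asked in Brazil between 1968 and 1969.
for q in search(INDEX, major="Labor", country="Brazil", years=(1968, 1969)):
    print(q.year, q.organization, q.text)
```

Because the second entry carries two tag pairs, it is returned both for a "Labor" search and for a "Latin America" search, which is the behavior that multiple filing is meant to provide.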
Most of the scholars served by the Center to date have been associated in a teaching and/or research capacity with academic institutions. Of this group, which constitutes approximately 90% of the total, about one third are professional sociologists, one third are political scientists, and the remaining third includes economists, psychologists, and historians.
Other users have been associated with independent research organizations, such as the Brookings Institution, the American Jewish Committee, and the Rand Corporation, or with federal, state and local government agencies in the United States and abroad. Among the American governmental groups which have used the Center are the U.S. Public Health Service and the Office of Education.
The research topics for which the Center's data have proven of value have ranged widely. Many scholars have made extensive use of the data in connection with long range research on the basic problems in human behavior. Others have found the data of value in conjunction with research of an applied nature, i.e., development and implementation of policy on current social problems, both domestic and international.
The following studies are illustrative:
(1) The Youthful Delinquent Behavior of Today's Respectable Man. The major aim of the research (Career Patterns Project) was to test the hypothesis that youthful delinquency, in various degrees, characterizes the adolescent behavior of the majority of today's respectable male adults. Clues were sought as to why and under what circumstances such socially disapproved behavior was abandoned.
(2) Public Knowledge of Science. An analysis of the areas of scientific information and ignorance among the American public. Specifically, the research dealt with such questions as the relationship between an individual's education and his science knowledge, the role of the mass media in increasing scientific information, and methods of increasing scientific knowledge among the American public.
(3) Prejudice and Desegregation. A systematic analysis of regional differences in authoritarianism.
(4) Public Opinion on Religion and Public Education. An examination of public attitudes toward the place of religion in public education, including special study of such topics as released time classes, federal aid to church schools, compulsory prayers in public school classes, restrictions upon church youth group activities in public educational institutions, and restrictions on curriculum content in public schools.
(5) Public Opinion on Non-Military Aid for Economic Development. An analysis of public opinion in relation to United States economic aid to underdeveloped nations. Specifically, the study focused on such facets of economic aid programs as grants versus loans, and bilateral versus multilateral aid.
In recent years, as a result of the steadily enlarging non-American segment of the Center's data archive there has been an increasing opportunity for scholars to carry out cross-national, comparative studies. The following inquiries are illustrative
of such secondary analyses, all of which employed non-American materials, and some of which were cross-national research efforts:
(1) "Problems of Political Consensus in Canada, the United States, Great Britain, and Australia."
(2) "Italian Attitudes on the Role of Religion in Politics."
(3) "Attitudes towards Citizenship in Great Britain, Germany, Italy, and Mexico."
(4) "Economic Development, Foreign Investment, and Nationalist Political Demands in Venezuela between 1945 and 1958." A study of the atti
(5) "Policies of United States Business in Underdeveloped Countries." A study of the attitudes of the people toward those policies of American business which affect the workers' standard of living (e.g., such services as health, housing, etc.).
(6) "The Effect of the Existence of Various European International Organizations on the French Government."
(7) "Attitudes towards the Space Program of the United States: Reactions to the Differential Achievements of the United States and the USSR." (various foreign countries)
(8) "Reaction of the People of the Far East to the Impact of Western Technology and Western Culture."
(9) "Comparative Study of Occupational Aspirations of German and American Youth."
Many articles, books, and scholarly papers have been published based wholly or in part on data obtained through the Roper Center. (Appendix C contains a selected bibliography.)
During the past two years the Center has added several new services. One major new service is the Center's summer training institute. The inaugural session of the Institute will take place July 1 through August 15, 1973. Its main purpose is to provide an intensive training program for undergraduates in the use of data banks, the design and execution of surveys, and secondary analysis statistical techniques.
The Center has also begun to expand its professional staff specifically for the purpose of developing a number of different research programs in specific substantive areas. Already underway is a program of secondary research in the area of family planning and fertility. Future plans call for a similar effort in the areas of race problems and international relations.
Finally, the Center is embarking on a substantially expanded program of communication and publication. In 1972 it began a monthly newsletter entitled Current Opinion. Its purpose is to make a contribution toward keeping open the lines of communication between the public and those in positions of special responsibility. Current Opinion includes results of recent surveys conducted by leading opinion research organizations in the United States and abroad. Its readership target groups include community, corporate and labor leaders, those in government, and educators, all of whom are vitally concerned with knowing the substance of and trends in public opinion.
Inquiries should be directed to Philip K. Hastings, Director, Roper Center, Williams College, Williamstown, Massachusetts 01267.
The International Social Science Council is conducting an inventory of current survey research in Latin America and is currently compiling a list of studies conducted in Peru and Brazil.
INVENTORY OF LATIN AMERICAN SURVEYS AT THE ROPER CENTER (AS OF MAY, 1973)
IBOPE INSTITUTO BRASILEIRO DE OPINIAO PUBLICA E ESTATISTICA
IGOP INSTITUTO GALLUP DE OPINAO PUBLICA
IISR INSTITUTE FOR INTERNATIONAL SOCIAL RESEARCH
USIA UNITED STATES INFORMATION AGENCY
IPSA INSTITUTO IPSA, S.A.
INESE INSTITUTO DE ESTUDOS SOCIAIS E ECONOMICOS
INRA INTERNATIONAL RESEARCH ASSOCIATES, INC.
MARPLAN PESQUISAS E ESTUDOS DE MERCADO LTDA.
IUDOP INSTITUTO URUGUAYO DE LA OPINION PUBLICA
Year Title Sample Organization
1961 International Relations National s/s=1712 IPSA-USIA LA-8
1962 Alliance for Progress National s/s=1540 (x2) INRA-USIA LA-13
1963 Family Planning (KAP) Detailed information on request
1963 International Relations Buenos Aires s/s=481 (x2) IPSA-USIA WS-I
1964 International Relations Buenos Aires s/s=517 (x2) IPSA-USIA WS-II
1965 International Relations Buenos Aires s/s=507 (x2) INRA-USIA WS-III
1967 The Arab-Israeli War Buenos Aires s/s=223 (Form A), s/s=185 (Form B) INRA #505
1969 Omnibus Survey s/s= IPSA #609
1971 Mendoza Dailies Mendoza s/s=601 (x2) IPSA #557
1955 International Relations Urban s/s=489 IPOM-USIA LA-1
1960 Omnibus National s/s=2739 IISR #15
1960 Parliamentarians Survey s/s=100 IISR #16
1961 Omnibus Urban s/s=1739 IPOM-USIA LA-8
1961 Law Students Survey Law Students s/s=1139 MARPLAN #0001
1962 Alliance for Progress National s/s=5700 INESE-USIA LA-13
1963 World Survey I (Selected Political Issues) Rio de Janeiro s/s=392 IPON-USIA WS-I
1963 Brazilian Images of U.S. University Students s/s=887 (x7) Aaron Feinsot
1963 Family Planning (KAP) Detailed information on request
1964 International Affairs Rio de Janeiro s/s=466 (x2) MARPLAN-USIA WS-II
1965 International Affairs Rio de Janeiro s/s=501 (x2) MARPLAN-USIA WS-II
1966 Family Planning Detailed information on request
1967 Youth Survey Urban s/s=1066 MARPLAN #106
1967 Omnibus Rio de Janeiro s/s=312 MARPLAN #JB-43
1967 Soluble Coffee Dispute Rio & Sao Paulo (Men only) IBOPE #P.25
1967 Omnibus Sao Paulo s/s=653 (x3) IGOP #001
1967 Omnibus Sao Paulo s/s=618 (x3) IGOP #002
1967 Omnibus Sao Paulo s/s=642 (x2) IGOP #003
1967 Omnibus Sao Paulo s/s=629 (x2) IGOP #004
1967 Omnibus Sao Paulo s/s=596 (x3) IGOP #005
1968 Media Study Rio de Janeiro s/s=4390 MARPLAN #0010
1968 Omnibus Sao Paulo s/s=617 IGOP #007
1968 Omnibus Sao Paulo s/s=714 IGOP #008
1968 Omnibus Sao Paulo s/s=548 IGOP #010
1968 Omnibus Sao Paulo s/s=617 IGOP #011
1968 Political Survey Piracicaba s/s=288 IBOPE P.02
1968 Household Products Consumers s/s=1000 IBOPE
1968 Insurance for Private Cars Car Owners s/s=498 IBOPE #0001
1968 Profile of Students Students s/s=286 IBOPE #0002
1968 Toilet Product Consumers s/s=1000 (x2) IBOPE
1968 Attitudes toward Parents 13-16 years old Rio & Sao Paulo s/s=200 IBOPE #0003
1968 Attitudes toward Parents 7-12 years old Guanabara & Sao Paulo s/s=200 IBOPE #0004
1968 Political Survey
1968 Political Survey
1968 Insurance for Taxis
1968 Political Survey
1968 Political Survey
1968 Political Survey
1968 Political Survey
1968 Public Administration Survey
1968 Political Survey
1968 Birth Control
1968 Political Survey
1968 Political Survey
1968 Political Survey
1968 Political Survey
1968 Medical Services
1968 Medical Questionnaire
1968 Survey of Medical Students
1968 International Affairs
Itapetiningo s/s=277 Aparacida s/s=245 s/s=1000 (x2) Taxi Drivers s/s=176 Marajuara s/s=290 Sao Paulo s/s=504 Garca s/s=280
Ponto Grossa, Parana s/s=295
Rio de Janeiro s/s=500
National Adult s/s=300
Sao Bernardo s/s=300
Medical & Administrative
Personnel and 5th & 6th Year
Medical Students s/s=402
3rd & 4th Year Medical
IBOPE P, .23
IBOPE P. ,27
IBOPE P, .23
IBOPE P, .29
IBOPE P, .30
IBOPE P, .31
IBOPE P. .09
IBOPE P, .20
IBOPE P .22
IBOPE P .24
IBOPE #107 MARPLAN JB-45 MARPLAN JB-46 MARPLAN JB-50 MARPLAN JB-52 MARPLAN JB-53 MARPLAN JB-54
1968 Omnibus Urban s/s=303 MARPLAN JB-55
1968 Omnibus Urban s/s=310 MARPLAN JB-56
1968 Omnibus Urban s/s=305 MARPLAN JB-57
1968 Omnibus Urban s/s=320 MARPLAN JB-58
1968 Omnibus Urban s/s=323 MARPLAN JB-63
1968 Omnibus Urban s/s=330 MARPLAN JB-64
1968 Omnibus Urban s/s=310 MARPLAN JB-65
1968 Omnibus Urban s/s=322 MARPLAN JB-66
1968 Omnibus Urban s/s=304 MARPLAN JB-68
1968 Omnibus Urban s/s=325 MARPLAN JB-69
1968 Omnibus Urban s/s=327 MARPLAN JB-70
1969 Omnibus Urban s/s=312 MARPLAN JB-71
1969 Omnibus Urban s/s=307 MARPLAN JB-72
1969 Omnibus Urban s/s=324 MARPLAN JB-73
1969 Omnibus Urban s/s=321 MARPLAN JB-74
1969 Omnibus Urban s/s=327 MARPLAN JB-75
1969 Omnibus Urban s/s=324 MARPLAN JB-76
1969 Omnibus Urban s/s=324 MARPLAN JB-77
1969 Omnibus Urban s/s=314 MARPLAN JB-78
1969 Omnibus Urban s/s=325 MARPLAN JB-79
1969 Omnibus Urban s/s=326 MARPLAN JB-80
1969 Omnibus Urban s/s=306 MARPLAN JB-81
1969 Omnibus Urban s/s=315 MARPLAN JB-82
1969 Omnibus Urban s/s=326 MARPLAN JB-83
1969 Omnibus Urban s/s=323 MARPLAN JB-84
1969 Omnibus Urban s/s=329 MARPLAN JB-85
1969 Omnibus Urban s/s=315 MARPLAN JB-86
1969 Omnibus Urban s/s=316 MARPLAN JB-87
1969 Omnibus Urban s/s=310 MARPLAN JB-88
1969 Omnibus Urban s/s=318 MARPLAN JB-89
1969 Omnibus Urban s/s=309 MARPLAN JB-90
1969 Omnibus Urban s/s=311 MARPLAN JB-91
1969 Omnibus Urban s/s=321 MARPLAN JB-92
1969 Omnibus Urban s/s=300 MARPLAN JB-93
1969 Omnibus Urban s/s=304 MARPLAN JB-94
1969 Omnibus Urban s/s=323 MARPLAN JB-95
1969 Omnibus Urban s/s=330 MARPLAN JB-96
1969 Omnibus Urban s/s=317 MARPLAN JB-97
1969 Omnibus Urban s/s=301 MARPLAN JB-98
1969 Omnibus Urban s/s=312 MARPLAN JB-99
1969 Omnibus Urban s/s=303 MARPLAN JB-100
1969 Omnibus Urban s/s=317 MARPLAN JB-101
1969 Omnibus Urban s/s=324 MARPLAN JB-102
1969 Omnibus Urban s/s=318 MARPLAN JB-103
1969 Omnibus Urban s/s=324 MARPLAN JB-104
1969 Omnibus Urban s/s=331 MARPLAN JB-105
1969 Omnibus Urban s/s=308 MARPLAN JB-106
1969 Customs of the Rio de Janeiro Citizen Rio de Janeiro s/s=322 MARPLAN JB-107
1969 Omnibus Urban s/s=318 MARPLAN JB-108
1969 Current Events Rio de Janeiro s/s=338 MARPLAN JB-109
1969 Current Events Rio de Janeiro s/s=310 MARPLAN JB-110
1969 Current Events Rio de Janeiro s/s=312 MARPLAN JB-111
1969 Omnibus Urban s/s=301 MARPLAN JB-112
1969 Omnibus Urban s/s=319 MARPLAN JB-113
1969 Omnibus Urban s/s=303 MARPLAN JB-114
1969 Teaching in School University Students s/s=320 MARPLAN JB-115
1969 Current Events Urban s/s=313 MARPLAN JB-116
1969 Current Events Urban s/s=309 MARPLAN JB-117
1969 Omnibus Urban s/s=300 MARPLAN JB-118
1969 Omnibus Urban s/s=300 MARPLAN JB-119
1969 Omnibus Urban s/s=300 MARPLAN JB-120
1969 Omnibus Urban s/s=300 MARPLAN JB-121
1970 Omnibus Urban s/s=300 MARPLAN JB-122
1970 Omnibus Urban s/s=300 MARPLAN JB-123
1970 Omnibus Urban s/s=300 MARPLAN JB-124
1970 Omnibus Urban s/s=311 MARPLAN JB-125
1970 Omnibus Urban s/s=299 MARPLAN JB-126
1970 Omnibus Urban s/s=329 MARPLAN JB-127
1970 Omnibus Urban s/s=314 MARPLAN JB-129
1970 Omnibus Urban s/s=324 MARPLAN JB-130
1970 Omnibus Urban s/s=317 MARPLAN JB-131
1970 Omnibus Urban s/s=303 MARPLAN JB-132
1970 Omnibus Urban s/s=325 MARPLAN JB-133
1970 Omnibus Urban s/s=321 MARPLAN JB-134
1970 Omnibus Urban s/s=325 MARPLAN JB-135
1970 Omnibus Urban s/s=316 MARPLAN JB-136
1970 Omnibus Urban s/s=306 MARPLAN JB-137
1970 Omnibus Urban s/s=318 MARPLAN JB-138
1970 Omnibus Urban s/s=311 MARPLAN JB-139
1970 Omnibus Urban s/s=324 MARPLAN JB-140
1970 Omnibus Urban s/s=312 MARPLAN JB-141