Glossary
ASCII: American Standard Code for Information Interchange. A character encoding scheme used by many computers. The ASCII standard uses 7 of the 8 bits that make up a byte to define the codes for 128 characters. Example: in ASCII, the number seven is treated as a character and is encoded as 00110111. Because a byte can have a total of 256 possible values, an additional 128 characters can be encoded in a byte, but there is no formal ASCII standard for those additional 128 characters. Most IBM-compatible personal computers use an IBM extended character set that includes international characters, line- and box-drawing characters, Greek letters, and mathematical symbols.
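The ASCII code for a digit character can be checked with a few lines of Python. This is only an illustrative sketch using built-in functions; the variable names are arbitrary.

```python
# ASCII assigns each character a code from 0 to 127, which fits in 7 bits.
char = "7"
code = ord(char)                # numeric code of the character "7"
bits_7 = format(code, "07b")    # the 7 bits that ASCII defines
bits_8 = format(code, "08b")    # the same code padded to a full byte

print(code, bits_7, bits_8)     # 55 0110111 00110111

# A byte can hold 256 values; ASCII defines only the first 128 of them.
assert all(ord(c) < 128 for c in "ASCII 7")
```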
C: A programming language.
CBw.d: Instructions that the SAS System uses to read standard numeric values from column-binary files, translating the data into standard binary format. The w value specifies the width of the variable, usually 8, but has a range between 1 and 32. The d value specifies the number of digits to the right of the decimal point in the numeric value.
card: Also known as deck, a physical record of data. A survey may have multiple cards for each respondent, all cards together comprising a logical record. Based on the IBM punch cards of 80-column length.
case: The unit of analysis in a particular data file. Can be an individual respondent to a questionnaire, a customer, or an industry. In the Roper Reports, each case is an interview respondent.
codebook: Description of the organization and content of a data file. Contains the code ranges and the code meanings needed to interpret the data file.
column binary: A code originally used with punched cards in which successive bits are represented by the presence or absence of punches in contiguous positions in columns. Using this method, responses to more than one question can be stored in a single column.
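The idea of storing answers to more than one question in a single column can be sketched in Python. The row numbering (1 through 12), the helper name `column_to_bits`, and the two sample questions are assumptions made for illustration; they are not taken from any particular codebook.

```python
ROWS = 12  # a card column has 12 punch positions

def column_to_bits(punched_rows):
    """Return a 12-character string of 0s and 1s, one per row (row 1 first)."""
    return "".join("1" if row in punched_rows else "0"
                   for row in range(1, ROWS + 1))

# Hypothetical layout: question A uses rows 1-5, question B uses rows 6-12.
answer_a = {3}       # respondent chose option 3 of question A
answer_b = {6, 9}    # respondent chose two options of question B

column = answer_a | answer_b   # both answers share one card column
bits = column_to_bits(column)
print(bits)                    # 001001001000
```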
data dictionary: A file, part of a file, or part of a printed codebook containing information about a data file, including the name of the element, its format, location, and size.
Data Documentation Initiative (DDI): An international committee sponsored by ICPSR that is developing a new metadata standard for social science documentation. This standard, developed by representatives from the international social science research community, is intended to fill the need for a structured codebook standard that will serve as an interchange format and permit the development of new Web applications. The Document Type Definition (DTD) for the DDI standard is written in XML (Extensible Markup Language) and is available at http://www.icpsr.umich.edu/DDI/.
deck: Also known as card, a physical record of data. A survey may have multiple decks for each respondent, all decks together comprising a logical record. Based on the IBM punch cards of 80-column length.
documentation: Information that accompanies a data file, describing the condition of the data, the creation of the file, the location and size of variables in the file, and the values (or codes) of the variables.
export file: A file produced by a software package that is designed to be read on another computer, often with a different operating system, running a version of the same software package.
HTML: HyperText Markup Language
ICPSR: Inter-university Consortium for Political and Social Research
informat: The instructions that specify how SAS reads the numbers and characters in a data file.
intermediate variable: A variable used when recoding data to input information from individual punches in multipunch data. Sets of intermediate variables are then recoded to produce final variables.
logical record: A complete unit of data for a particular unit of analysis, in this project a single respondent. Multiple physical records, called cards or decks, may make up a logical record.
missing value: A value code that indicates no data are present for a variable for a particular case. To be distinguished from non-response values (respondent refused to answer or was not asked the question) and from invalid responses (the response did not have a valid value code equivalent). Non-responses and invalid responses may or may not have value categories provided in the questionnaire and may be treated differently from true missing data during analysis.
multipunched: A way of recording data, originally used with punched cards, in which successive bits are represented by the presence or absence of punches in contiguous positions in columns. Using this method, responses to more than one question can be stored in a single column.
OCR: Optical character recognition
PDF: Portable Document Format, a published standard format developed by Adobe Systems, accessed with proprietary software.
punch card: A paper medium used for recording computer-readable data. The card is punched by a special machine called a keypunch that works like a typewriter, except that it punches holes in cards instead of typing characters on paper. The punch cards are then processed with a card reader that transfers the punched information to a computer-readable digital format.
PUNCH.d: Instructions that the SAS System uses to read standard numeric values from column-binary files. The d value specifies which row in a card column to read. Valid values for d are 1 through 12.
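A minimal Python sketch of a PUNCH.d-style read follows. It assumes the result is 1 when row d of the column is punched and 0 otherwise; the function name `punch` and the set-of-rows representation of a column are hypothetical, and this is not SAS code.

```python
def punch(column, d):
    """Return 1 if row d of the card column is punched, else 0."""
    if not 1 <= d <= 12:
        raise ValueError("d must be between 1 and 12")
    return 1 if d in column else 0

column = {2, 7}            # rows punched in one card column (assumed layout)
print(punch(column, 2))    # 1
print(punch(column, 3))    # 0
```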
questionnaire: The set of questions asked in a survey. In the Yale Roper Collection, the questionnaire, with columns and codes written next to the question, may substitute for a codebook.
recode: Changing the value code of a variable from one value to another. For example, changing 0 and 1 values in column binary data files to value ranges of 0 through 12. Also known as data transformation.
respondent: In survey research, the person responding to the survey questions.
ROWw.d: Instructions that the SAS System uses to read a column-binary field down a card column. The w value specifies the row where the field begins, with a range between 1 and 12. The d value specifies the length in rows of the field. Valid values for d are 1 through 25, with a default value of 1. The informat assigns the relative position of the punch in the field range to a numeric variable.
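The "relative position of the punch in the field range" can be illustrated with a short Python sketch. Several things here are assumptions: the function name `row_informat` is hypothetical, the sketch is restricted to fields that fit within a single 12-row column, and a field with no punch or with more than one punch is treated as missing (`None`). It is not SAS code.

```python
def row_informat(column, w, d=1):
    """Return the 1-based position of the single punch within rows w..w+d-1."""
    if not 1 <= w <= 12:
        raise ValueError("w must be between 1 and 12")
    if d < 1 or w + d - 1 > 12:
        raise ValueError("this sketch only handles fields inside one column")
    hits = [row for row in range(w, w + d) if row in column]
    if len(hits) != 1:
        return None              # assumption: no punch or several punches is missing
    return hits[0] - w + 1

column = {5}                          # only row 5 is punched
print(row_informat(column, 3, 4))     # 3 (row 5 is the third row of the field 3-6)
print(row_informat(column, 6, 4))     # None (rows 6-9 contain no punch)
```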
SAS: Set of proprietary computer programs used for analysis of social science statistical data. (No longer an acronym; originally stood for Statistical Analysis System.)
SGML: Standard Generalized Markup Language
SPSS: Statistical Package for the Social Sciences. Set of proprietary computer programs used for analysis of social science statistical data.
single-punch: A single response coded in a column.
split sample: A method of data collection in which one group of respondents is queried with one form of a questionnaire and the second group is queried with a different form of the questionnaire.
spread: Recoding multiple responses that have been coded in a single column of a record to a separate column for each response.
system file: A data file or collection of data files specifically formatted for a particular software package; may not be readable by other software packages.
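Spreading, as defined in the spread entry above, can be sketched in Python: each response code that was punched into one shared column becomes its own 0/1 variable. The helper name `spread` and the variable names (`q5_1` and so on) are hypothetical and chosen only for illustration.

```python
def spread(punched_rows, prefix, n_codes):
    """Turn the set of codes punched in one column into one 0/1 variable per code."""
    return {f"{prefix}_{code}": int(code in punched_rows)
            for code in range(1, n_codes + 1)}

# One respondent marked options 1 and 4 of a four-option question.
record = spread({1, 4}, prefix="q5", n_codes=4)
print(record)   # {'q5_1': 1, 'q5_2': 0, 'q5_3': 0, 'q5_4': 1}
```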
TIFF: Tagged Image File Format
values: The numeric or character equivalents for a particular variable in a data file.
variable: An item in a data file to which a value has been assigned. A data file contains the values of certain variables measured for a set of cases. In the Roper Report data files, variables are responses to questions or parts of questions from each person interviewed.
XML: Extensible Markup Language, a data format for structured document interchange on the Web.
xray: A form of output that is organized by card, column, and row; each bit has its own unique location within this framework. The total number of punched bits across all observations is recorded for each location in the data set. This sum often provides a response frequency for individual response options.
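The tally described in the xray entry can be sketched in Python as a count of punches per (card, column, row) location across all observations. The function name `xray`, the tuple representation of a location, and the sample data are assumptions for illustration only.

```python
from collections import Counter

def xray(observations):
    """Count punches at each (card, column, row) location across all observations."""
    counts = Counter()
    for punches in observations:
        counts.update(punches)
    return counts

# Each observation is the set of (card, column, row) locations that are punched.
observations = [
    {(1, 10, 1), (1, 11, 3)},
    {(1, 10, 1), (1, 11, 4)},
    {(1, 10, 2), (1, 11, 3)},
]

for location, total in sorted(xray(observations).items()):
    print(location, total)
# (1, 10, 1) 2
# (1, 10, 2) 1
# (1, 11, 3) 2
# (1, 11, 4) 1
```

If rows 1 and 2 of column 10 hold the two options of one question, the totals 2 and 1 are the response frequencies for those options.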
Sources of Glossary Terms
Armor, David J., and Arthur S. Couch. 1972. Data-text Primer: An Introduction to Computerized Social Data Analysis. New York: The Free Press.
Dodd, Sue A. 1982. Cataloging Machine-readable Data Files: An Interpretive Manual. Chicago: American Library Association.
Dodd, Sue A., and Ann M. Sandberg-Fox. 1985. Cataloging Microcomputer Files: A Manual of Interpretation for AACR2. Chicago: American Library Association.
Geda, Carolyn L. [n.d.] Data Preparation Manual. Sponsored by John D. Peine, Project Coordinator, Heritage Conservation and Recreation Service, U.S. Department of the Interior.
Jacobs, Jim. Glossary of Selected Social Science Computing Terms and Social Science Data Terms. University of California, San Diego. Available at http://odwin.ucsd.edu/glossary/index.html.
SAS Institute. 1990. SAS Language: Reference. Version 6, 1st ed. Cary, NC: SAS Institute.
Sippl, Charles J. 1966. Computer Dictionary. Indianapolis: Howard W. Sams & Co., Inc.
Sippl, Charles J., and Roger J. Sippl. 1980. Computer Dictionary. 3rd ed. Indianapolis: Howard W. Sams & Co., Inc.
Spencer, Donald D. 1968. Computer Programmer's Dictionary and Handbook. Waltham, MA: Blaisdell Publishing Company.
SPSS, Inc. 1988. SPSS-X User's Guide. 3rd ed. Chicago: SPSS, Inc.
Weik, Martin H. 1969. Standard Dictionary of Computers and Information Processing. New York: Hayden Book Company.