Primary Data Sources
This section provides summary descriptions of primary data sources and links to more detailed survey information.
The data in this report come from surveys conducted by the National Center for Science and Engineering Statistics (NCSES) within the National Science Foundation (NSF), other federal agencies, and nonfederal organizations. Users should take great care when comparing survey data from these different sources. Differences in definitions, survey procedures, and phrasing of questions, among other things, make these data less than strictly comparable. Efforts have been made to maintain consistency throughout these tables, but it has sometimes been necessary, for accuracy, to use distinct terminology that does not match that used in other tables.
The collection and reporting of race and ethnicity data pose several problems. First, both the naming of population subgroups and their definitions have changed over time. Second, many of the groups of particular interest are quite small, so it is difficult to measure them accurately without larger samples or surveys of the entire population of interest. In some instances, sample surveys may not have had sufficient sample size to permit the calculation of reliable racial or ethnic population estimates for all groups; consequently, data are not always shown for some groups. For example, the Bureau of Labor Statistics’ Current Population Survey does not provide data on unemployment among American Indians or Alaska Natives. Third, data on race and ethnicity are often based on self-identification. Fourth, it is easy to overlook or minimize heterogeneity within racial or ethnic subgroups when only a single statistic is estimated for their entire population.
In October 1997, the Office of Management and Budget (OMB) announced new government-wide standards for the collection of data on race and ethnicity (https://obamawhitehouse.archives.gov/omb/fedreg_1997standards) that became effective 1 January 2003. OMB specified the following categories and definitions of racial and ethnic groups:
Respondents can select one or more racial designations, and those who select more than one are classified under “more than one race.”
The Department of Education published final guidance in the Federal Register on 19 October 2007 (72 Fed. Reg. 59267) to transition to the new OMB standards for reporting race and ethnicity. Previously, the Department of Education’s National Center for Education Statistics (NCES) had identified mutually exclusive racial and ethnic groups as white, black, Hispanic, Asian or Pacific Islander, and American Indian or Alaska Native. In 2008, NCES changed race and ethnicity reporting for degree completion data and for enrollment data. For the degree completion data, reporting in the new categories became mandatory for the 2011–12 data collection (i.e., 2011 data). For the fall enrollment data, reporting in the new categories became mandatory for the 2010–11 data collection (i.e., 2010 data). However, institutions were not required to update the race and ethnicity data of individuals who were already in their systems. In this report, the racial and ethnic groups detailed in Integrated Postsecondary Education Data System (IPEDS) tables, which run through 2016, incorporate OMB’s new race and ethnicity reporting standards for all years for which data are provided. For more information, see https://www2.ed.gov/policy/rschstat/guid/raceethnicity/index.html.
In this report, racial and ethnic information is shown only for U.S. citizens and permanent residents in NCES data and graduate enrollment data.
High-Hispanic-enrollment institutions (HHEs) are nonprofit public and private institutions of higher education whose full-time equivalent (FTE) enrollment of undergraduate students is at least 25% Hispanic. The FTE enrollment of Hispanic students is determined from enrollment data that institutions reported to the fall 2016 IPEDS Enrollment survey conducted by NCES. NCES determined FTE enrollment by estimating that approximately three part-time students are equivalent to one full-time student. Because IPEDS does not collect part-time credit hour information, the FTE numbers are only an approximation.
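The FTE approximation and the 25% threshold described above can be sketched as follows. This is a minimal illustration, not NCES code: the function names and the enrollment counts are hypothetical, and the three-to-one conversion is applied exactly rather than "approximately."

```python
def fte_enrollment(full_time: int, part_time: int) -> float:
    """Approximate FTE enrollment, counting three part-time students
    as one full-time student (the approximation described above)."""
    return full_time + part_time / 3


def is_high_hispanic_enrollment(hisp_ft: int, hisp_pt: int,
                                total_ft: int, total_pt: int,
                                threshold: float = 0.25) -> bool:
    """Return True when Hispanic undergraduate FTE enrollment is at least
    `threshold` of total undergraduate FTE enrollment."""
    total_fte = fte_enrollment(total_ft, total_pt)
    if total_fte == 0:
        return False  # assumption: an institution with no enrollment is not an HHE
    return fte_enrollment(hisp_ft, hisp_pt) / total_fte >= threshold


# Hypothetical institution: 900 full-time and 300 part-time Hispanic
# undergraduates out of 3,000 full-time and 1,500 part-time undergraduates.
hispanic_fte = fte_enrollment(900, 300)   # 900 + 300/3
total_fte = fte_enrollment(3000, 1500)    # 3000 + 1500/3
qualifies = is_high_hispanic_enrollment(900, 300, 3000, 1500)
```

In this hypothetical case the Hispanic FTE share is 1,000 of 3,500, or roughly 29%, so the institution would meet the 25% threshold.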
Historically black colleges and universities (HBCUs) are academic institutions listed by the White House Initiative on Historically Black Colleges and Universities. The Higher Education Act of 1965, as amended, defines an HBCU as “any historically black college or university that was established prior to 1964, whose principal mission was, and is, the education of black Americans, and that is accredited by a nationally recognized accrediting agency or association determined by the Secretary [of Education] to be a reliable authority as to the quality of training offered or is, according to such an agency or association, making reasonable progress toward accreditation.” See https://sites.ed.gov/whhbcu/one-hundred-and-five-historically-black-colleges-and-universities/.
Tribal colleges are 32 fully accredited academic institutions on a list maintained by the White House Initiative on Tribal Colleges and Universities. See https://sites.ed.gov/whiaiane/tribes-tcus/tribal-colleges-and-universities/.
For several reasons, data on people with disabilities have limitations. First, the operational definitions of disability may vary across a wide range of physical and mental impairments, and these definitions may not be comparable. The Americans with Disabilities Act of 1990 (ADA) encouraged progress toward standard definitions. Under ADA, an individual is considered to have a disability if he or she has a physical or mental impairment that substantially limits one or more of his or her major life activities, has a record of such impairment, or is regarded as having such impairment. ADA also contains definitions of specific disabilities.
Second, data on disabilities frequently are not included in comprehensive institutional records (e.g., in registrars’ records in institutions of higher education). If included at all, such data may be kept only in confidential files at an office responsible for providing special services to students. Institutions of higher education are unlikely to have information regarding students with disabilities who have not requested that they be provided with special services related to their disabilities. In contrast, in elementary or secondary school programs that receive funds to provide special education, statistics on all students identified as having special needs are centrally available.
Third, information about people with disabilities that is gathered from surveys is often obtained from self-reported responses. Typically, respondents are asked to state whether they have any specified physical, mental, or sensory impairment or limitation in order to classify them as having a disability. Resulting data therefore reflect individual perceptions of their functioning, rather than more objective measures of functioning that use standardized criteria such as those used in clinical studies of disability.
Fourth, some surveys have recently changed the wording of their questions on disability, lowered thresholds for defining disability, or both. This has resulted in an increase in the number of individuals with disability as measured.
Variation in estimates of the proportion of the undergraduate student population with disabilities is evidence of the limitations of these data. Self-reported data on the undergraduate student population, collected through an NCES survey to ascertain patterns of student financial aid, suggest that about 19.5% of this population have some form of disability. The estimate of those with disability who are doctoral recipients is lower, 8%. Estimates from Census Bureau surveys, in contrast, place the proportion of the U.S. civilian population with disability at 13% (all ages) and 6% (age 18–34 years). It is difficult to ascertain whether this discrepancy is the result of self-perception, incomplete reporting, disabilities that are not evident, differing definitions, or all of these effects.
Sources of data on people with disabilities cited in this report include the National Postsecondary Student Aid Study (NPSAS), conducted by NCES; the National Survey of College Graduates (NSCG), Survey of Doctorate Recipients (SDR), and the Survey of Earned Doctorates (SED), all conducted by NCSES; and the American Community Survey (ACS), conducted by the Census Bureau. These sources are described in more detail later in this appendix; the following is a brief description of how each source treats the issue of disability.
NPSAS (2016) asked students whether they had serious difficulty hearing; seeing; concentrating, remembering, or making decisions; or walking or climbing stairs. Respondents who answered “yes” for any activity are classified as having a disability. NCES states that NPSAS (2016) estimates of students with disabilities should be interpreted with caution and not compared to prior NPSAS data because of changes to the wording of the interview questions used and the instruction to include but not limit their responses to “a serious learning disability, depression, ADD [attention deficit disorder], or ADHD [attention deficit hyperactivity disorder].”
The ACS (2016) asks a series of yes or no questions about each individual at a sampled address as to whether the person has serious difficulty hearing; seeing; concentrating, remembering, or making decisions; walking or climbing the stairs; dressing or bathing; or doing errands alone. The Census Bureau categorizes respondents with one or more “yes” responses as having a disability.
The SED (2017), SDR (2017), and the NSCG (2017) all have the same series of disability questions. The respondent is asked, “What is the USUAL degree of difficulty you have with” seeing; hearing; walking; lifting 10 pounds; and concentrating, remembering, or making decisions. The response choices are none, slight, moderate, severe, and unable to do. Respondents who report “moderate,” “severe,” or “unable to do” for any activity are classified as having a disability.
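The two classification rules described above (any “yes” response in the ACS; any response of “moderate” or worse in the SED, SDR, and NSCG) can be sketched as follows. This is an illustration only: the item keys and function names are hypothetical, and treating an unanswered item as “no” or “none” is an assumption rather than documented survey practice.

```python
from typing import Mapping

# Hypothetical short keys for the activities named in each question series.
ACS_ITEMS = ("hearing", "seeing", "cognition", "walking", "self_care", "errands")
NCSES_ITEMS = ("seeing", "hearing", "walking", "lifting", "cognition")

# Response levels that count toward a disability in the NCSES surveys.
NCSES_DISABILITY_LEVELS = {"moderate", "severe", "unable to do"}


def acs_has_disability(responses: Mapping[str, bool]) -> bool:
    """ACS rule: one or more 'yes' responses means a disability.
    Assumption: a missing item is treated as 'no'."""
    return any(responses.get(item, False) for item in ACS_ITEMS)


def ncses_has_disability(responses: Mapping[str, str]) -> bool:
    """SED/SDR/NSCG rule: 'moderate', 'severe', or 'unable to do' on any
    activity means a disability. Assumption: a missing item is 'none'."""
    return any(responses.get(item, "none") in NCSES_DISABILITY_LEVELS
               for item in NCSES_ITEMS)


# A respondent with slight difficulty seeing and moderate difficulty walking
# is classified as having a disability under the NCSES rule.
example = ncses_has_disability({"seeing": "slight", "walking": "moderate"})
```

Because the two rules use different activities and different thresholds, the same person could be classified differently by the two sources, which is one reason the estimates are not directly comparable.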
The following sources from NCSES were used for data tables in this publication. Published data tables from these surveys can be accessed on the NCSES website at https://www.nsf.gov/statistics/. In addition, researchers may access data directly from the NCSES Interactive Data Tool (https://ncsesdata.nsf.gov/ids/) or the Scientists and Engineers Statistical Data System (SESTAT) Data Tool (https://ncsesdata.nsf.gov/sestat/sestat.html).
The Survey of Earned Doctorates (SED) is an annual census of individuals who earned a research doctorate from an accredited U.S. academic institution. The most common research doctorate degree is the doctor of philosophy (PhD). Recipients of professional degrees, such as the juris doctor (JD) and doctor of medicine (MD), are not included in the SED. Data are collected directly from individual doctorate recipients contacted through their university. Responses are gathered primarily through a Web-based questionnaire, with a small number of responses from paper questionnaires and computer-assisted telephone interviews. The recipients are asked to provide information about their field of doctoral study, educational history, postgraduate plans for work and further study, and demographic characteristics. Since the survey’s inception in 1957, more than 90% of the annual cohort of doctorate recipients has typically responded to the questionnaire each year.
For individuals who do not respond to the SED, data that are available from public sources (e.g., field of doctorate) are added to the file. No adjustments are made for nonresponse, and no imputation is used for missing items among respondents. The data for a given year include all doctorates awarded in the 12-month period ending on 30 June of that year.
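The 12-month reporting period described above can be expressed as a small date rule. The sketch below is illustrative only; the function name is hypothetical and is not part of any NCSES tool.

```python
from datetime import date


def reporting_year(award_date: date) -> int:
    """Return the data year for a degree awarded on `award_date`, where each
    data year covers the 12 months ending on 30 June of that year."""
    # Degrees awarded from 1 July onward fall in the next year's data.
    return award_date.year + 1 if award_date.month >= 7 else award_date.year


# A doctorate awarded on 15 December 2016 falls in the period
# 1 July 2016 to 30 June 2017, so it belongs to the 2017 data.
year = reporting_year(date(2016, 12, 15))
```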
The SED is sponsored by six federal agencies: NSF, the National Institutes of Health, the Department of Education, the National Endowment for the Humanities, the Department of Agriculture, and the National Aeronautics and Space Administration. Further information about the SED can be found at https://nsf.gov/statistics/srvydoctorates/.
The Survey of Graduate Students and Postdoctorates in Science and Engineering, more commonly referred to as the Graduate Students Survey (GSS), is an annual census of all U.S. academic institutions granting research-based master’s degrees or doctorates in science, engineering, and selected health fields as of fall of the survey year. The survey, sponsored by NSF and the National Institutes of Health, collects the total number of graduate students, postdoctoral appointees (postdocs), and doctorate-level nonfaculty researchers by demographic and other characteristics, such as source of financial support. Results are used to assess shifts in graduate enrollment and postdoctoral appointments and trends in financial support.
The survey collects data from institutions’ branch campuses, affiliated research centers, medical schools, schools of nursing, and schools of public health. The 2016 survey covered 714 academic institutions. Data are collected separately for each eligible organizational unit (academic department or program, research center, or health facility).
Approximately 99% of institutions and affiliated units respond to the survey. Missing data for nonresponding units are imputed by using prior years’ data, when available, or by using data provided from similar units at a peer institution.
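The order of fallback described above (prior years’ data first, then data from a similar unit at a peer institution) might be sketched as follows. This is a simplified illustration with hypothetical names; the actual GSS imputation procedure may adjust or combine values in ways not described here.

```python
from typing import Optional


def impute_unit_count(prior_year: Optional[int],
                      peer_unit: Optional[int]) -> Optional[int]:
    """Choose an imputed count for a nonresponding unit.

    Prefer the unit's own prior-year value when available; otherwise fall
    back to the value from a similar unit at a peer institution. Returns
    None when neither source is available (an assumption of this sketch).
    """
    if prior_year is not None:
        return prior_year
    return peer_unit


# A unit with no prior-year report takes the peer unit's value.
imputed = impute_unit_count(prior_year=None, peer_unit=42)
```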
In 2016, the survey included a pilot data collection designed to assess the feasibility of (1) reporting master’s and doctoral student data separately, (2) using Classification of Instructional Programs codes for reporting GSS data, and (3) expanding the use of file uploads for data submission. Data provided by pilot coordinators are included in the 2016 data products.
In 2014, the survey frame was updated following a comprehensive frame evaluation study that identified potentially eligible but not previously surveyed U.S. academic institutions. A total of 151 newly eligible institutions were added, and two private for-profit institutions offering mostly practitioner-based graduate degrees were removed as no longer eligible. For more information, see https://www.nsf.gov/statistics/srvygradpostdoc/.
Due to the 2014 methodological changes and other changes that have occurred in recent cycles, care should be used when assessing trends within the GSS data.
Reporting of race and ethnicity since 2008 is likely to have been affected by changes in reporting in IPEDS. Starting in 2008, IPEDS respondents were asked to use a new race and ethnicity classification that included a category for persons who are not Hispanic, a category for persons who identify with more than one race, and a category for Native Hawaiians and Other Pacific Islanders, separate from Asians. The new classification was optional in 2008 and 2009 IPEDS but mandatory in 2010, and it may have contributed to a significant increase in GSS reporting of “Not Hispanic or Latino, more than one race.”
Further information about GSS can be found at https://www.nsf.gov/statistics/srvygradpostdoc/.
The National Survey of College Graduates (NSCG) is a repeated cross-sectional survey conducted biennially since the 1990s that provides data on the nation’s college graduates, with a particular focus on those in the science and engineering (S&E) workforce. The survey samples individuals who are living in the United States during the survey reference week, have at least a bachelor’s degree, and are under the age of 76. This survey is a unique source for examining various characteristics of college-educated individuals, including occupation, work activities, salary, the relationship of degree field and occupation, and demographic information.
The 2017 NSCG includes over 83,000 respondents (70% unweighted response rate), representing a population of about 61 million college graduates living in the United States. Of these college graduates, an estimated 33 million are classified as scientists and engineers. These are individuals with a bachelor’s or higher-level degree who were educated or employed in an S&E or S&E-related field. Individuals not included in the survey frame for the 2017 NSCG are U.S.-educated scientists and engineers earning degrees after 31 December 2015 and foreign-educated scientists and engineers who came to the United States after 31 December 2015.
NSCG classifies the following broad categories as S&E occupations: computer and mathematical scientists, life and related scientists, physical and related scientists, social and related scientists, and engineers. Postsecondary teachers are included within each of these groups. The following are considered S&E-related occupations: health and related occupations; S&E managers; S&E precollege teachers; S&E technicians and technologists, including computer programmers; and other S&E-related occupations, such as architects and actuaries. All other occupations are non-S&E occupations. Among the largest are non-S&E managers, non-S&E teachers, social services and related occupations, and sales and marketing occupations. Further information on NSCG can be found at https://www.nsf.gov/statistics/srvygrads/.
The Survey of Doctorate Recipients (SDR) is a longitudinal survey conducted biennially since 1973 that provides demographic and career history information about individuals with a research doctoral degree in a science, engineering, or health (SEH) field from a U.S. academic institution. The survey follows a sample of individuals with SEH doctorates throughout their careers from the year of their degree award until age 76. The panel is refreshed each survey cycle with a sample of new SEH doctoral degree earners. Results are used to make decisions related to the educational and occupational achievements and career movement of the nation’s doctoral scientists and engineers.
For the 2017 SDR, all 2015 sample members who remained age eligible for the survey were retained, and a sample of new graduates who had earned their degrees from 1 July 2013 to 30 June 2015 was added. The resulting 2017 SDR sample of 124,580 cases consisted of 113,814 age-eligible cases from the 2015 SDR and 10,766 cases from the new cohort of graduates from academic years 2014 and 2015. These 124,580 sample cases represent the 1,103,210 U.S.-trained research doctorate recipients under 76 years of age. Field of study reporting for the 2017 SDR was revised and updated to better align with the new NCSES Taxonomy of Disciplines (ToD), which more closely aligns with the Classification of Instructional Programs (2010) issued by the National Center for Education Statistics.
Further information on the SDR is available at https://www.nsf.gov/statistics/srvydoctoratework/.
The following non-NSF sources were used for data tables in this report.
National Center for Education Statistics, Department of Education
The Integrated Postsecondary Education Data System (IPEDS) is a collection of survey programs that surveys all postsecondary institutions, including universities and colleges and the institutions that offer technical and vocational education. Since 1992, completion of all IPEDS surveys has been mandatory for all institutions that participate in or are applicants for participation in any federal financial assistance program authorized by Title IV of the Higher Education Act of 1965, as amended. IPEDS comprises several integrated component surveys. These surveys obtain information about types of institutions where postsecondary education is available, student participants, fall enrollments, programs offered and completed, graduation rates, and the human and financial resources involved in the delivery of postsecondary education. In this report, data are primarily drawn from the IPEDS Fall Enrollment Survey and the IPEDS Completions Survey, which is administered to all institutions offering degrees at the bachelor’s degree level and above, 2-year institutions, and less-than-2-year institutions.
NCES changed degree-level categories in the IPEDS Completions Survey in fall 2008, but reporting in the new categories was optional for 2008 and 2009 data. Reporting in the new degree-level categories was mandatory for the 2010–11 IPEDS Completions collection. Before 2008, the post-baccalaureate degree categories were master’s, first professional, and doctor’s. With the 2008 changes, the category first professional degree is no longer used. Programs and awards in that category (e.g., medicine, law, pharmacy, and theology) are now reclassified as either master’s degrees or as one of three types of doctor’s degrees: doctor’s-research/scholarship, doctor’s-professional practice, or doctor’s-other. Numbers reported here for 2008 and 2009 doctoral degrees combine doctor’s degrees reported by institutions using the pre-2008 reporting categories and doctor’s-research/scholarship degrees reported by institutions using the 2008 reporting categories. Data for 2010 include only doctorates reported as doctor’s-research/scholarship.
The IPEDS Completions data include Title IV institutions in the 50 states, the District of Columbia, and other U.S. jurisdictions. The IPEDS Completions data cover all awards granted between 1 July of one year and 30 June of the next. The year of the data indicates the end of the academic year in which the degrees were awarded. This report also uses the Fall Enrollments Component of the IPEDS Enrollment data, which provides a snapshot of the enrollment at an institution for a specific time in the fall. For the IPEDS Fall Enrollment Survey, institutions with traditional academic year calendar systems (semester, quarter, trimester, or 4-1-4) report their enrollment as of 15 October or the official fall reporting date of the institution. Institutions with calendar systems that differ by program or allow continuous enrollment report students who are enrolled at any time between 1 August and 31 October. Enrollment numbers reported are as of that date in the indicated year.
Further information on the IPEDS is available at https://nces.ed.gov/ipeds.
National Center for Education Statistics, Department of Education
The National Postsecondary Student Aid Study (NPSAS) was established by NCES to collect information about financial aid allocated to students enrolled in U.S. postsecondary institutions. NPSAS was first administered in the fall of the 1986–87 academic year. NCES conducted subsequent cycles of NPSAS during the academic years of 1989–90, 1992–93, 1995–96, 1999–2000, 2003–04, 2007–08, 2011–12, and 2015–16.
The 2015–16 NPSAS (NPSAS:16) is a nationally representative sample survey of undergraduate and graduate students enrolled any time between 1 July 2015 and 30 June 2016 in institutions eligible to participate in federal financial aid programs. The NPSAS:16 sample consisted of about 2,000 institutions, 89,000 undergraduate students, and 24,000 graduate students attending Title IV postsecondary institutions in the 50 states, the District of Columbia, and Puerto Rico. The sample represented approximately 6,900 institutions, 20 million undergraduate students, and 4 million graduate students enrolled in postsecondary education at any time during the survey period. The response rate at the institutional level was 88% and at the student level was 94%.
Further information on the NPSAS is available at https://nces.ed.gov/npsas.
Bureau of Labor Statistics, Department of Labor
The Current Population Survey (CPS) is a monthly household survey conducted by the Census Bureau for the Bureau of Labor Statistics. It provides data on employment and unemployment by age, sex, race, and a variety of other characteristics, and it is the source of the monthly official U.S. unemployment rate. Estimates calculated from the CPS reflect the civilian noninstitutional population age 16 and older. CPS gathers information from approximately 60,000 households monthly through personal and telephone interviews. Basic labor-force data are gathered monthly; data on special topics are gathered in periodic supplements. Consecutive monthly estimates are often averaged to produce quarterly or annual average estimates. Monthly response rates are generally around 85%.
Further information on the CPS is available at https://www.bls.gov/cps/.
Office of Personnel Management
The Office of Personnel Management (OPM) provides estimates of federally employed scientists and engineers through its Enterprise Human Resources Integration Statistical Data Mart (EHRISDM). The data cover most executive branch agencies and some legislative and judicial branch agencies. Coverage is limited to federal employees with at least a bachelor’s degree and is subject to change over time. For example, the State Department stopped providing data on Foreign Service personnel in 2006 and stopped providing all data in 2015. In 2017, the question on disability was changed, and the new wording resulted in an increase in the number of federal employees with disabilities as measured.
More information on OPM’s estimates of federally employed scientists and engineers is available at https://www.opm.gov/policy-data-oversight/data-analysis-documentation/.
The American Community Survey (ACS), conducted by the Census Bureau, is a monthly survey that produces annual estimates of the U.S. population as well as various demographic, labor force, and housing characteristics. The survey was designed to allow for estimates of small geographical areas that previously were only possible using decennial census data. The 2016 ACS sample size was 3.5 million addresses, resulting in 2.2 million household interviews.
Further information on the ACS is available at https://www.census.gov/programs-surveys/acs/.
The data from all the sources used for this report are subject to error. For non-census survey programs (NSCG, SDR, NPSAS, CPS, ACS), accuracy is determined by the joint effects of sampling and nonsampling errors. Sampling errors arise because estimates based on a sample differ from figures that would have been obtained if a complete population had been surveyed. The sample selected for any survey is only one of a large number of possible samples of the same size and design that could have been selected. Even if all other aspects of the survey remained fixed, such as the questionnaire and instructions, the estimates from each sample would differ. This variability, termed sampling error, occurs by chance and is measured by the standard error associated with a particular estimate.
The standard error of a sample survey estimate measures the precision with which an estimate from one sample approximates the true population value, and it can be used to construct a confidence interval for a survey parameter to assess the accuracy of the estimate. For further information on the sources of sampling error and their impact on survey estimates, see each survey’s website.
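As an illustration of how a standard error translates into a confidence interval, the sketch below uses the common normal approximation (estimate plus or minus a critical value times the standard error). The function name and the numbers are hypothetical. Individual surveys may recommend other methods, so this should be read as a generic example rather than the procedure used for any table in this report.

```python
from statistics import NormalDist


def confidence_interval(estimate: float, standard_error: float,
                        level: float = 0.95) -> tuple:
    """Normal-approximation confidence interval for a survey estimate.

    Assumes the sampling distribution of the estimate is approximately
    normal, which is a simplification for small subgroups.
    """
    # Two-sided critical value, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical example: an estimated 13.0% with a standard error of
# 0.5 percentage points gives a 95% interval of roughly 12.0% to 14.0%.
lower, upper = confidence_interval(13.0, 0.5)
```

A wider interval signals a less precise estimate, which is typical for small population subgroups.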
Nonsampling errors can arise from design, reporting, and processing errors, as well as from errors due to nonresponses or faulty responses. These errors can occur in data from sample surveys, from census surveys (SED, GSS, IPEDS), and from administrative data (OPM). Nonsampling errors include respondent-based events, such as some respondents interpreting questions differently from other respondents, respondents making estimates rather than giving actual data, and respondents being unable or unwilling to provide complete, correct information. Errors can also arise during the processing of responses, such as during recording and keying. Nonsampling errors are difficult to measure, and estimates of nonsampling errors are not available for data in this report.