Data project

Samples of Anonymised Records (SARs)

Summary

The SARs are designed to ensure that sample members cannot be identified. To achieve this confidentiality, the amount of detail available is restricted to a non-disclosive level and each individual respondent appears in only one file. Despite these measures, the data still look like those you might collect if you were to conduct a survey yourself, and can be analysed in the same way.

The SARs have the further advantage of much larger sample sizes than are typical in alternative survey data sources. For example, the 2001 Individual SAR contains 3% of UK census records, equating to 1.84 million cases. The largest file, the 2001 Small Area Microdata (SAM), is a 5% file containing nearly three million cases.

Each SARs file contains data from one census only (1991 or 2001, with 2011 to follow in due course). This contrasts with other individual-level (or microdata) census products such as the Office for National Statistics Longitudinal Study, which links individual data records over time. However, unlike the Longitudinal Study, most SARs files can be downloaded and used at your own place of work rather than requiring access from a safe setting.

Like other microdata files, the SARs enable researchers to analyse data in a very flexible manner. Users can:
• apply their own definitions and create new variables
• define tables
• work with sub-populations
• conduct multivariate analyses

Because the files are very large, they also permit analyses of relatively small sub-populations for which it is often difficult to obtain sufficient sample sizes in other survey data. Consequently, a major use of the SARs has been for the analysis of individual ethnic groups.

Statistics are most useful when they are comparable across space and time, and the SARs were designed with this in mind. However, there are some differences between the 1991 and 2001 SARs. Some of these differences reflect changes in society that led to different questions in the census; others reflect changes in confidentiality disclosure control.

Type of data

Data Source
Other, please specify

Type of Study
Cross-section, occasional

Data gathering method
Self administered questionnaire

Access to data

Conditions of access
Data are available from the UK Data Service (previously the Economic and Social Data Service, ESDS): http://ukdataservice.ac.uk/ The website contains detailed information on conditions of access, and it is also possible to contact the UK Data Service by phone: +44 (0)1206 872143, or by email: help@ukdataservice.ac.uk

Type of available data (e.g. anonymised microdata, aggregated tables, etc.)
Anonymised microdata

Formats available
Data from the UK Data Service are usually available to download in SPSS, Stata and tab-delimited (suitable for use in MS Excel) formats.
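As an illustration of how a tab-delimited extract might be used, the following is a minimal Python sketch that reads records, derives a user-defined variable, selects a sub-population and builds a simple two-way table. The column names (age, sex, ethnic_group, econ_activity) and their codes are invented for this example and do not correspond to the actual SARs codebook; the handful of rows is made-up data standing in for a downloaded file.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: column names and codes are invented for illustration
# and do not match the real SARs documentation.
SAMPLE_TSV = (
    "age\tsex\tethnic_group\tecon_activity\n"
    "34\tF\tA\temployed\n"
    "61\tM\tB\tretired\n"
    "52\tF\tB\temployed\n"
    "47\tM\tA\tunemployed\n"
    "70\tF\tB\tretired\n"
)


def read_records(handle):
    """Read tab-delimited records into a list of dicts, converting age to int."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["age"] = int(row["age"])
        records.append(row)
    return records


def crosstab(records, row_var, col_var):
    """Count records for each (row_var, col_var) combination."""
    return Counter((r[row_var], r[col_var]) for r in records)


# In practice the handle would be an opened file downloaded from the archive.
records = read_records(io.StringIO(SAMPLE_TSV))

# Apply a user-defined definition to create a new variable.
for r in records:
    r["aged_50_plus"] = r["age"] >= 50

# Work with a sub-population and define a table on it.
older = [r for r in records if r["aged_50_plus"]]
table = crosstab(older, "ethnic_group", "econ_activity")

for (group, activity), count in sorted(table.items()):
    print(group, activity, count)
```

The sketch uses only the Python standard library so that it runs without additional packages; for files with millions of cases, a statistical package or a data-frame library would usually be a more practical choice.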

Coverage

Years of collection, reference years, and sample sizes
The SARs are a family of datasets drawn from the 1991 and 2001 UK Censuses. The SARs contain a separate record for each individual, but identifying information has been removed to protect confidentiality. The SARs datasets are similar to data from a survey, albeit with a much larger sample size, thus permitting analysis of small sub-groups and small geographic areas. The SARs cover the full range of Census topics, including housing, education, health, transport, employment, ethnicity and religion. For the 2001 Census, the Individual Licensed Sample of Anonymised Records (I-SAR) had a sample size of 1,843,525 cases. Note: Older people are represented in this data source (approximately) according to their proportion in the population. In 2001, approximately one third of the total UK population was aged 50 and over.

First year of collection
1991

Stratification if applicable
The data include a variety of demographic variables, including age and sex.

Base used for sampling

Geographical coverage and breakdowns
Countries (England, Wales, Scotland, Northern Ireland)
Government Office Regions (NUTS1)
More detailed spatial data are available under Special Licence.

Age range
All ages

Statistical representativeness
Population representative

Coverage of main and cross-cutting topics
The Samples of Anonymised Records (SARs) are a family of datasets which are currently available from the 1991 and 2001 Censuses and will be available in due course from the 2011 Census. Each file is a sample of individual person-level records drawn from the census database that has been anonymised. Each file contains a broad range of socio-demographic characteristics for respondents, with a particular emphasis on either individual, household, or geographical detail. The SARs allow flexible, multivariate analysis at the individual level. They are a unique data source for the investigation of a range of social issues including household composition, ethnic differences, education and employment. The SARs differ from traditional Census outputs in that they are not aggregated into pre-determined tables.

Linkage

Standardisation
There is an ongoing cross-governmental programme of work in the UK which aims to develop and improve standardised inputs and outputs for use in official statistics. This is known as harmonisation, and is led by the Office for National Statistics (ONS). While this work primarily affects government-run surveys, the results have an impact on most national UK data sources. Furthermore, harmonisation has important benefits for all researchers using these surveys, and not just government statisticians. For more information, see: http://www.ons.gov.uk/ons/guide-method/harmonisation/harmonisation-index-page/index.html

Possibility of linkage among databases
Data are anonymised.

Data quality

Entry errors if applicable
Summary information on data entry is not readily available, but the survey documentation contains the available information on data processing after the data were collected. For further information on data quality, contact ONS (Office for National Statistics), or review the documentation on the UK Data Service website.

Breaks
There are no major breaks for this data source. However, data are collected only every 10 years (the interval between censuses).

Consistency of terminology or coding used during collection
The consistency of this data source is very good. Further information cannot be provided as that would entail discussing specific variables. For more information on data quality, contact ONS, or review the documentation on the UK Data Service website.

Governance

Contact information
Office for National Statistics
Customer Contact Centre
Government Buildings, Cardiff Road
NP10 8XG Newport, South Wales
United Kingdom
Phone: +44 (0) 1633 455678
Email: info(at)statistics.gsi.gov.uk
Url: http://ukdataservice.ac.uk/

Timeliness, transparency
Timetables for the availability of census data vary by data type, and the SARs are not usually available until at least 24 months after each census.