
Definition

Statistical data linkage refers to the bringing together of data from different sources to gain a greater understanding of a situation or individual from the combined (or linked) dataset. This facilitates a better understanding of the patterns of service use by groups of clients for research, statistical or policy analysis, planning and evaluation purposes.

The linkage key has the form: XXXXXDDMMYYYYN

The sequence in which the linkage key is completed is as follows:

Family name (the first 3 Xs)

Given name (the 4th and 5th X)

Date of birth by day, month and four-digit year

Sex
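The completion sequence above can be sketched as a simple concatenation. This is a minimal illustration, not part of the specification: the function name assemble_linkage_key is hypothetical, the name components are assumed to have been derived already, and the date of birth used in the usage note is invented for the example.

```python
from datetime import date

# Sex codes described in this specification (1 Male, 2 Female,
# 3 Intersex or indeterminate, 9 Not stated/inadequately described).
VALID_SEX_CODES = {1, 2, 3, 9}


def assemble_linkage_key(family_letters: str, given_letters: str,
                         dob: date, sex_code: int) -> str:
    """Join already-derived components in the order XXX, XX, DDMMYYYY, N.

    Hypothetical helper: it only concatenates and checks lengths, it does
    not derive the letters from a full name.
    """
    if len(family_letters) != 3:
        raise ValueError("family name component must be 3 characters")
    if len(given_letters) != 2:
        raise ValueError("given name component must be 2 characters")
    if sex_code not in VALID_SEX_CODES:
        raise ValueError("sex code must be one of 1, 2, 3, 9")
    # %d%m%Y renders the date as DDMMYYYY with zero-padded day and month.
    return f"{family_letters.upper()}{given_letters.upper()}{dob:%d%m%Y}{sex_code}"
```

For instance, a female client with family name Smith (MIH), given name Elizabeth (LI) and an illustrative date of birth of 7 March 1980 would give assemble_linkage_key("MIH", "LI", date(1980, 3, 7), 2), which returns MIHLI070319802.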

XXX 2nd, 3rd and 5th letters of the family name.

In the first three spaces the agency should record the 2nd, 3rd and 5th letters of the client’s family name.

For example: If the client’s family name is Smith the reported value should be MIH. If the client’s family name is Jones the reported value should be ONS.

Regardless of the length of a person’s name, the reported value should always be three characters long. If the legal family name is not long enough to supply the requested letters (i.e. a legal family name of fewer than five letters) then agencies should substitute the number ‘2’ to reflect the missing letters. The placement of a number ‘2’ should always correspond to the same space that the missing letter would have within the 3-character field. A number (rather than a letter) is used for such a substitution in order to clearly indicate that an appropriate corresponding letter from the person’s name is not available.

Cases where the family name has fewer than 5 letters:

If a person’s family name is Farr, then the value reported would be AR2 because the 2 is substituting for the missing 5th letter of the family name. Similarly, if the person’s family name is Hua, then the value reported would be UA2 because the 2 is substituting for the missing 5th letter of the family name.

If a client’s family name is missing altogether the agency should record 999 in the three spaces associated with the family name (not the number 2).

In some cultures it is traditional to state the family name first. To overcome discrepancies in recording/reporting that may arise as a result of this practice, agencies should always ask the person to specify their legal first given name and their legal family name separately. These should then be recorded as first given name and family name as appropriate, regardless of the order in which they may be traditionally given.

If the client’s family name includes non-alphabetic characters—for example hyphens (as in Lee-Archer), apostrophes (as in O’Mara) or blank spaces (as in De Vries)—these non-alphabetic characters should be ignored when counting the position of each character.
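The family name rules above can be sketched in code as follows. This is a minimal illustration under stated assumptions: the function name family_name_letters is hypothetical, a missing name is represented as None or an empty string, and letters are returned in upper case to match the examples in this specification.

```python
from typing import Optional


def family_name_letters(family_name: Optional[str]) -> str:
    """Return the 2nd, 3rd and 5th letters of a family name.

    Non-alphabetic characters (hyphens, apostrophes, spaces) are ignored
    when counting positions, a missing letter is replaced by '2', and a
    name that is missing altogether is reported as '999'.
    """
    # Keep letters only, so that punctuation and spaces do not shift positions.
    letters = [ch for ch in (family_name or "").upper() if ch.isalpha()]
    if not letters:
        return "999"
    # Zero-based indexes 1, 2 and 4 are the 2nd, 3rd and 5th letters.
    return "".join(letters[i] if i < len(letters) else "2" for i in (1, 2, 4))


print(family_name_letters("Smith"))  # MIH
print(family_name_letters("Farr"))   # AR2
print(family_name_letters("Hua"))    # UA2
```

Under the same rules, Lee-Archer is counted as LEEARCHER and De Vries as DEVRIES before the letters are picked.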

XX 2nd and 3rd letters of given name

In the fourth and fifth spaces the agency should record the 2nd and 3rd letters of the client’s given name.

For example: If the client’s given name is Elizabeth the reported value should be LI. If the client’s given name is Robert the reported value should be OB.

If the client’s given name includes non-alphabetic characters, for example hyphens (as in Jo-Anne) or apostrophes (as in D’Arcy), these non-alphabetic characters should be ignored when counting the position of each character.

Regardless of the length of a person’s given name, the reported value should always be two characters long. If the given name of the person is not long enough to supply the requested letters (i.e. a name of fewer than three letters) then agencies should substitute the number ‘2’ to reflect the missing letters. The placement of a number ‘2’ should always correspond to the same space that the missing letter would have within the 2-character field. A number (rather than a letter) is used for such substitutions in order to clearly indicate that an appropriate corresponding letter from the person’s name is not available.

For example: If the person’s given name is Jo then the value reported would be O2 because the 2 is substituting for the missing 3rd letter of the given name.

If the person’s given name is missing altogether the agency should record 99 for the two spaces associated with the given name.

In some cultures it is traditional to state the family name first. To overcome discrepancies in recording/reporting that may arise as a result of this practice, agencies should always ask the person to specify their given name and their family name separately. These should then be recorded as first given name and family name as appropriate, regardless of the order in which they may be traditionally given.
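The given name rules above can be sketched in the same way. This is a minimal illustration under stated assumptions: the function name given_name_letters is hypothetical, a missing name is represented as None or an empty string, and letters are returned in upper case to match the examples in this specification.

```python
from typing import Optional


def given_name_letters(given_name: Optional[str]) -> str:
    """Return the 2nd and 3rd letters of a given name.

    Non-alphabetic characters are ignored when counting positions, a
    missing letter is replaced by '2', and a name that is missing
    altogether is reported as '99'.
    """
    # Keep letters only, so that hyphens and apostrophes do not shift positions.
    letters = [ch for ch in (given_name or "").upper() if ch.isalpha()]
    if not letters:
        return "99"
    # Zero-based indexes 1 and 2 are the 2nd and 3rd letters.
    return "".join(letters[i] if i < len(letters) else "2" for i in (1, 2))


print(given_name_letters("Elizabeth"))  # LI
print(given_name_letters("Robert"))     # OB
print(given_name_letters("Jo"))         # O2
```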

Date of Birth

DD represents the day in the month a person was born

MM represents the month in the year a person was born

YYYY represents the year a person was born

If date of birth is not known or cannot be obtained, provision should be made to collect or estimate age. Collected or estimated age would usually be in years for adults and to the nearest three months (or less) for children aged less than two years. Additionally, an estimated date flag or a date accuracy indicator should be reported in conjunction with all estimated dates of birth.

For data collections concerned with children's services, it is suggested that the estimated date of birth of children aged under 2 years should be reported to the nearest 3 month period, i.e. 0101, 0104, 0107, 0110 of the estimated year of birth. For example, a child who is thought to be aged 18 months in October of one year would have his/her estimated date of birth reported as 0104 of the previous year. Again, an estimated date flag or date accuracy indicator (http://meteor.aihw.gov.au/content/index.phtml/itemId/294429) should be reported in conjunction with all estimated dates of birth.
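The estimation rule for children aged under 2 years can be sketched as below. This is an illustration only, and it rests on one interpretive assumption: "nearest 3 month period" is read as the quarter that contains the estimated month of birth, reported as the first day of that quarter. The function name estimate_dob_under_two is hypothetical and the reference year in the example is invented.

```python
from datetime import date


def estimate_dob_under_two(age_months: int, as_at: date) -> date:
    """Estimate a date of birth as the first day of a 3 month period.

    Assumption (not stated in the specification): the estimated month of
    birth is mapped to the quarter that contains it, i.e. 1 January,
    1 April, 1 July or 1 October.
    """
    if not 0 <= age_months < 24:
        raise ValueError("rule applies to children aged under 2 years")
    # Count months on a single scale so that year boundaries are handled.
    total_months = as_at.year * 12 + (as_at.month - 1) - age_months
    year, month_index = divmod(total_months, 12)  # month_index is 0..11
    quarter_start_month = (month_index // 3) * 3 + 1  # 1, 4, 7 or 10
    return date(year, quarter_start_month, 1)


# A child thought to be 18 months old in October 2020 (illustrative year).
estimated = estimate_dob_under_two(18, date(2020, 10, 15))
print(estimated.strftime("%d%m%Y"))  # 01042019
```

The sketch returns only the date. The estimated date flag or date accuracy indicator still has to be reported alongside it, and its value is not derived here.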

Sex

N is a one-digit code that represents the person's sex: 1 Male or 2 Female.

Operationally, sex is the distinction between male and female, as reported by a person or as determined by an interviewer.

When collecting data on sex by personal interview, asking the sex of the respondent is usually unnecessary and may be inappropriate, or even offensive. It is usually a simple matter to infer the sex of the respondent through observation, or from other cues such as the relationship of the person(s) accompanying the respondent, or first name. The interviewer may ask whether persons not present at the interview are male or female.

A person's sex may change during their lifetime as a result of procedures known alternatively as sex change, gender reassignment, transsexual surgery, transgender reassignment or sexual reassignment. Throughout this process, which may be over a considerable period of time, the person's sex could be recorded as either Male or Female.

In data collections that use the ICD-10-AM classification, where sex change is the reason for admission, diagnoses should include the appropriate ICD-10-AM code(s) that clearly identify that the person is undergoing such a process. This code(s) would also be applicable after the person has completed such a process, if they have a procedure involving an organ(s) specific to their previous sex (e.g. where the patient has prostate or ovarian cancer).

Code 3 Intersex or indeterminate

Is normally used for babies for whom sex has not been determined for whatever reason.

Should not generally be used on data collection forms completed by the respondent.

Should only be used if the person or respondent volunteers that the person is intersex or where it otherwise becomes clear during the collection process that the individual is neither male nor female.

Code 9 Not stated/inadequately described

Is not to be used on primary collection forms. It is primarily for use in administrative collections when transferring data from data sets where the item has not been collected.

Data that has been produced by linkage for statistical and research purposes should not be used subsequently for client management purposes.

This data cluster contains a set of specific data elements to be reported on in a predetermined combination.

Metadata items in this Data Set Specification

Below is a list of all the components within this Data Set Specification.
Each entry includes the item name, whether the item is optional, mandatory or conditional and the maximum number of times the item can occur in a dataset.
If the items must occur in a particular order in the dataset, the sequence number is included before the item name.


Data Elements

| Data Element | Data Type | Length | Inclusion | Max occurrences |
|---|---|---|---|---|
| Person—letters of family name, text XXX | - | - | mandatory | 1 |
| Person—letters of given name, text XX | - | - | mandatory | 1 |
| Person: Date of Birth, DDMMYYYY | Date/Time | 8 | mandatory | 1 |
| Person: Sex, Code N | Number | 1 | mandatory | 1 |
| Record—linkage key, code 581 XXXXXDDMMYYYYN | - | - | mandatory | 1 |
| Date—accuracy indicator, code AAA | - | - | conditional | 1 |

Conditional inclusion (Date—accuracy indicator, code AAA): Where a date of birth is estimated the date accuracy indicator should be used.
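
The XXXXXDDMMYYYYN layout listed above lends itself to a mechanical format check. The sketch below is one possible approach rather than an official validation rule: the names KEY_PATTERN and is_well_formed_key are hypothetical, upper-case letters are assumed, and the pattern is deliberately permissive in that it checks the character classes of each component but not whether a given mix of letters and 2s could arise from a real name.

```python
import re
from datetime import datetime

# 3 family name characters (letters or '2', or '999' when missing),
# 2 given name characters (letters or '2', or '99' when missing),
# 8 date digits (DDMMYYYY), then a sex code of 1, 2, 3 or 9.
KEY_PATTERN = re.compile(
    r"(?:[A-Z2]{3}|999)(?:[A-Z2]{2}|99)([0-9]{8})[1239]"
)


def is_well_formed_key(key: str) -> bool:
    """Check the layout of a linkage key and that DDMMYYYY is a real date."""
    match = KEY_PATTERN.fullmatch(key)
    if match is None:
        return False
    try:
        # strptime rejects impossible dates such as 31 February.
        datetime.strptime(match.group(1), "%d%m%Y")
    except ValueError:
        return False
    return True


print(is_well_formed_key("MIHLI070319802"))  # True
print(is_well_formed_key("MIHLI310219802"))  # False, 31 February
```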

Comments

Guide for use:
Where a date of birth is estimated the date accuracy indicator should be used. Please see Relational attributes.

Origin:
AIHW 1998. Home and Community Care (HACC) Data Dictionary Version 1.0. Report prepared for the Commonwealth and State/Territory government HACC Officials.

References

NCSIMG 2001. Statistical Data Linkage in Community Services Data Collections. Canberra: Australian Institute of Health and Welfare.
This content is based on Australian Institute of Health and Welfare material. Attribution provided as required under the AIHW CC-BY licence.

Related content

Relation Count
As a numerator in an Indicator 0
As a denominator in an Indicator 0
As a disaggregation in an Indicator 0