Data collection standards - General data items

This section discusses the importance of collecting general data items, and presents existing national standards which should be used to ensure consistent and high quality collection of data by Victorian government departments, agencies and service providers.

Individual and demographic data items

Collecting information from people who come into contact with agencies and service providers is often a balancing act between respecting the privacy of individuals, gaining enough information to perform core business functions, making informed decisions about service use, and providing a deeper understanding about the experiences of people affected by family violence. Individual and demographic information is particularly valuable for creating a demographic profile of those affected by family violence, and is therefore essential in improving the family violence evidence base in Victoria. This section highlights data items which can be used to improve the quality and consistency of individual and demographic information on people who come into contact with an agency or service.

Names

Details about a person’s name are commonly collected by agencies and service providers. This information can be used by organisations as the primary identifier of a person’s identity, and is often necessary when seeking to count unique individuals who come into contact with an agency or service. Names are also an important component for data linkage projects, which are increasingly becoming a priority across government departments in Australia, enabling improved understanding of client pathways across agencies and services. It is therefore important that name information is collected consistently and accurately. The Commonwealth Attorney-General’s Department recognised the need to capture accurate and detailed information on people’s names, and published better practice guidelines for the collection of identity data in 2011.11 While these guidelines are targeted at Commonwealth agencies, the document sets out several principles that are valuable for all agencies and services that collect name information for the purpose of establishing identity. Some of the key principles in these guidelines are outlined below.

Information should be collected whenever possible. While there are some contexts where collection of a person’s name may not be appropriate, including where services are provided anonymously, information should be collected about a person’s first and last name whenever a person consents to provide it.

Information should reflect what is written on an official ID document. To ensure that name information is collected uniformly and accurately, given names and family names should be recorded as they are written on an official government identification document. Data collectors should avoid recording nicknames, initials or abbreviated names in the given and family name fields; these should instead be captured in separate data fields. Whenever possible, non-alphabetical features (including hyphens and spaces) should be collected as they appear in a name on official identification, as these features can improve the uniqueness of collected names.

It is noted that asking people to produce an ID document to verify their name may present a number of operational challenges or may not be possible given the nature of some services (for example, telephone-based services). A requirement to provide an official ID document may also be challenging for, or offensive to, people from some priority communities. For the reasons discussed, it is important to collect name information as recorded on official ID documents. However, unless official identification is required to be sighted by a service, it is recommended that people are not asked to provide an ID document to verify their name.

Additional name information should be collected. People frequently use names other than the name recorded on their identification document. If a person has a nickname or different name they use, this information should be collected in addition to the name appearing on their identification document. Staff should use this name during all communications with that person.

Please note that when asking if someone uses a different name than what is recorded on their identification document, it may be considered offensive to ask what their ‘preferred’ name is. This is particularly true for some trans and gender diverse people, and as such it is not recommended that agencies use the word ‘preferred’ when asking about the name that a person uses. For more information, please see the LGBTI communities section of this framework, which provides specific advice regarding the collection of information from LGBTI people.

Examples of name questions:

What is your given name[s], as it is recorded on government issued ID?

What name do you use? [if different from given name/s]

What is your family name, as it is recorded on government issued ID?

Name usage type. To assist with differentiating between the types of names collected, agencies or service providers may elect to use a data item which indicates the type of name used. Name usage type (METeOR identifier: 453366) concerns the usage type of a person’s family name and/or given name and enables differentiation between each recorded name.12

Records should be updated where appropriate and historical records retained. It is acknowledged that people often change the name that is recorded on their official ID document for a variety of reasons. Name information should be kept up-to-date to reflect the current information recorded on a person’s identification documents. Wherever possible, historical records of a person’s name should be retained, as this provides greater potential to confirm a person’s identity.13
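One way to keep the current name up to date while retaining historical records is sketched below. This is a minimal illustration only: the class and field names (NameEntry, PersonNames, name_used, superseded_on) are hypothetical and are not drawn from the guidelines or from METeOR.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class NameEntry:
    """A name as recorded on official ID (hypothetical structure)."""
    given_names: str
    family_name: str
    recorded_on: date
    superseded_on: Optional[date] = None  # None means this is the current name


@dataclass
class PersonNames:
    history: List[NameEntry] = field(default_factory=list)
    name_used: Optional[str] = None  # name the person uses, held separately

    @property
    def current(self) -> Optional[NameEntry]:
        """Return the most recent entry that has not been superseded."""
        for entry in reversed(self.history):
            if entry.superseded_on is None:
                return entry
        return None

    def update_official_name(self, given_names: str, family_name: str, when: date) -> None:
        """Record a new official name and keep the previous one as history."""
        previous = self.current
        if previous is not None:
            previous.superseded_on = when
        self.history.append(NameEntry(given_names, family_name, when))
```

Earlier entries are closed off rather than overwritten, so they remain available when confirming a person's identity.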

Age / Date of birth

Age is widely collected by agencies and service providers; however, the method by which it is collected varies. Some organisations collect age as a discrete number (for example, ‘65’), others as a predefined group (for example, ‘60 and older’), and others as a date of birth which is later used to derive a person’s age.

The standard recommended in this framework is to collect age in the form of an individual’s date of birth. Date of birth should be collected in accordance with the format described by METeOR identifier 287007, which is DDMMYYYY. Collecting date of birth has the advantage of providing more detailed and accurate information about a person’s age, and it can be used for other purposes, including identifying unique individuals and linking datasets. The ABS has noted that “collecting age in complete years can lead to an error where a respondent may round off or approximate their age”.14 Additionally, collecting age as a number fixes a client age to the year in which it was collected, and the information loses meaning if it is unclear when the age was recorded. This is particularly true when a client interacts with an organisation over a long period of time.
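The following sketch shows how a date of birth stored as DDMMYYYY can be validated and then used to derive age at any reference date, which is what a stored age alone cannot provide. The function names parse_dob and age_on are illustrative assumptions, not part of the METeOR standard.

```python
from datetime import date, datetime


def parse_dob(text: str) -> date:
    """Parse a date of birth given as DDMMYYYY (eight digits, no separators)."""
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError("date of birth must be eight digits in DDMMYYYY order")
    # strptime raises ValueError for impossible dates such as 31022000
    return datetime.strptime(text, "%d%m%Y").date()


def age_on(dob: date, reference: date) -> int:
    """Age in completed years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (dob.month, dob.day)
    return reference.year - dob.year - (0 if had_birthday else 1)
```

Because age is derived at the time of use, the same record gives a correct age for a service contact in 2018 and for another contact several years later.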

Wherever possible, an agency should ask for the date of birth that an individual uses on their official ID documents. As with name information, unless official identification is required to be sighted by a service, it is recommended that people are not asked to provide an ID document to verify their date of birth.

It is noted that there are instances where an individual may not know their birth date, or where they are only able to supply a numerical estimate of their age. The date of birth standard used in a number of national minimum datasets (METeOR identifier: 287007) suggests that if date of birth is not known or cannot be obtained, provision should be made to collect or estimate age. Collected or estimated age would usually be in years for people 2 years and older, and to the nearest 3 months (or less) for children aged less than 2 years old. An estimated date of birth may also be relevant for unborn children. A child believed to be aged 18 months in January 2018 would therefore have an estimated birthdate of 01062016.15

When a recorded date of birth has been estimated, it is recommended that an indicator is used to denote that the date is an estimate. Information regarding best practice for indicating estimated dates is discussed later in this section.

Example of age question:

What is your date of birth [as it is recorded on your birth certificate or other form of identification]? DDMMYYYY

Date variables

Most organisations that collect data will record some type of date information. This can include the date when a service was initiated, attended and completed, when family violence was disclosed, when a family violence incident occurred, or, as previously noted, date of birth. To ensure that high quality date information is collected, it is recommended that dates are completed in full whenever possible, and contain an accurate record of the day, month and year.

In Australia, the date format most commonly used is DDMMYYYY. It is recommended that information is collected and stored in this format.

Finally, it is recommended that organisations collect a ‘create date’ or similar data item. This date represents the date an electronic record is created to be stored in an administrative data collection system. It is strongly recommended that the create date is auto-generated at the time that a case or record is first created. If records are promptly entered into a records management system, this approach ensures that the date is consistently and accurately collected.
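An auto-generated create date could be implemented along the following lines. This is a sketch under stated assumptions: CaseRecord and its fields are hypothetical names, and a real records management system would more likely set the value in the database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class CaseRecord:
    """Hypothetical case record with a system-generated create date."""
    case_file_number: str
    service_start_date: Optional[date] = None
    # Filled in automatically when the record is created; init=False means
    # staff cannot supply or overwrite it through the constructor.
    create_date: date = field(default_factory=date.today, init=False)
```

Keeping the create date separate from operational dates, such as the service start date, means that late data entry does not distort either value.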

Estimated dates. When a date is estimated, agencies and service providers are encouraged to use a data item which indicates that the date is an estimate. When estimating dates, Date – accuracy indicator (METeOR identifier: 294429) can be utilised to indicate which parts of the recorded date are known, estimated and unknown. The table below depicts how three codes are used in combination to provide information about the accuracy of a recorded date. For a full list of response options, please see the METeOR website.16

               Date component (for a format DDMMYYYY)
Data domain    Day    Month    Year
Accurate       A      A        A
Estimated      E      E        E
Unknown        U      U        U
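As an illustration of how the three codes in the table combine, the sketch below builds a three-character indicator in day, month, year order. The function name is hypothetical, and the sketch does not check the result against the full list of response options published on METeOR.

```python
VALID_FLAGS = {"A", "E", "U"}  # Accurate, Estimated, Unknown


def date_accuracy_indicator(day: str, month: str, year: str) -> str:
    """Combine one flag per date component into a three-character code."""
    flags = (day, month, year)
    for flag in flags:
        if flag not in VALID_FLAGS:
            raise ValueError(f"unrecognised accuracy flag: {flag!r}")
    return "".join(flags)


# Day unknown, month estimated, year accurate
print(date_accuracy_indicator("U", "E", "A"))  # UEA
```

Storing the indicator alongside the date lets analysts exclude or down-weight estimated components without discarding the record.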

Gender identity and sex

Most agencies and service providers will collect information on a person’s sex or gender identity. Due to the overlap between these data items and information concerning LGBTI communities, the standard suggested for collecting this information is described in detail in the LGBTI communities section of this framework.

Geographic variables

Agencies and service providers may collect a range of geographic information surrounding family violence, depending on their operational purpose and the services that they provide. Examples of geographic data collected include the home address of a victim-survivor or perpetrator (person address) and the address of where a service is provided (organisation address).

Many agencies and service providers will collect geographic information as aggregated regions, for example, Local Government Areas (LGAs) or Department of Health and Human Services (DHHS) service regions. While this information may satisfy internal demands, it can have limited value for research purposes or when looking to compare data with population-based statistics. Additionally, geographic areas such as LGAs and other administrative service regions change over time.

Collecting detailed geographic information allows data to be mapped to higher level geographic structures as needed, such as those that exist in the ABS Australian Statistical Geography Standard (ASGS) 2016.17 The ‘Address Details Data Dictionary’ (METeOR identifier: 434713) created by the AIHW sets out address information which is important to collect for these purposes. As recommended by the data dictionary, when collecting information on addresses agencies and service providers should collect primary address information. This includes:18

  • address site (or primary complex) name
  • address number or number range
  • road name (name/type/suffix)
  • locality
  • state/territory
  • postcode
  • country (if other than Australia)
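The primary address components listed above could be held as separate fields rather than as a single free-text line, as in the sketch below. The class and field names are assumptions made for illustration and are not the element names used in the data dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrimaryAddress:
    """Hypothetical container for primary address information."""
    road_name: str                  # name, type and suffix, e.g. 'High Street'
    locality: str
    state_territory: str
    postcode: str                   # text, so leading zeros are preserved
    address_site_name: Optional[str] = None
    address_number: Optional[str] = None   # text, so ranges such as '12-14' fit
    country: Optional[str] = None   # recorded only if other than Australia
```

Holding each component separately is what allows records to be mapped to different geographic structures later, as needed.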

In 2010, DHHS released their ‘Address reference data dictionary (version 1.1)’,19 which provides a common set of concepts, data elements and edit/validation rules that define the basis of address data collection. For agencies and service providers interested in collecting data elements beyond the primary address information listed above, this data dictionary draws on existing national standards to help data collection custodians better document and manage address data. It is therefore a valuable resource to consider when seeking to align address data items with national standards.

Unique identifiers and data linkage considerations

Unique identifiers

Unique identifiers (UIDs) can be used in a variety of contexts, but typically their purpose is to identify a unique item, person or case file. Often UIDs will consist of a distinct combination of letters and numbers that are randomly assigned and auto-generated by a records management system; however, they may also be borrowed from other forms of identification codes. For example, some services may use a person’s driver’s licence or passport number as their UID, or they may adopt a UID given by a different system (for example, courts may use the person identification number created by Victoria Police’s Law Enforcement Assistance Program (LEAP) system).

Examples of unique identifiers:

Case file number: C14-12-002
Client number: Z103903

Where UIDs are used, it is recommended that agencies and service providers ensure that the numbers given to clients are always unique, and that there is a protocol in place to prevent UIDs or clients from being duplicated.
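A simple protocol for preventing duplicate UIDs is to check each candidate against the identifiers already issued before assigning it. The sketch below assumes a hypothetical client number format of ‘Z’ followed by six digits and keeps issued identifiers in memory; a production system would more likely rely on a uniqueness constraint in its database.

```python
import secrets


class ClientIdRegistry:
    """Issues client numbers and refuses to issue the same one twice."""

    def __init__(self) -> None:
        self._issued = set()

    def issue(self) -> str:
        while True:
            # Hypothetical format: 'Z' followed by six random digits
            candidate = f"Z{secrets.randbelow(10**6):06d}"
            if candidate not in self._issued:
                self._issued.add(candidate)
                return candidate
```

This addresses duplicated identifiers only. Preventing the same client from being registered twice requires a separate search of existing records before a new identifier is issued.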

The Office of the Victorian Information Commissioner has created an Information Sheet concerning the use of UIDs and relevant considerations under the Privacy and Data Protection Act 2014 (Vic).20 Organisations should review their obligations under relevant legislation in Victoria when considering implementing the use of UIDs for identifying individuals.

Statistical Linkage Key (SLK): As described by the AIHW in METeOR identifier 686241, an SLK is a key that enables two or more records belonging to the same individual to be brought together. It is a 14-character string made up of the second, third and fifth alphanumeric characters of a person’s family name, the second and third alphanumeric characters of a person’s given name, the day, month and year when the person was born (in the format DDMMYYYY) and a single alphanumeric character representing the sex of a person, concatenated in that order:21

XXXXXDDMMYYYYX

SLKs are valuable to collect because they not only serve as unique identifiers which assist with counting unique people accessing a service but, as they are uniformly created, can also be used to link individuals across internal and external data sets.
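The composition of the key described above can be sketched as follows. Several details in this sketch are assumptions that are not stated in this framework and should be checked against the METeOR specification: punctuation and spaces in names are skipped, a name that is too short is padded with ‘2’, and a missing name is recorded as ‘999’ (family name) or ‘99’ (given name). The function names are hypothetical.

```python
def _name_part(name: str, positions: tuple, missing: str) -> str:
    """Pick the characters at the given 1-based positions of a name."""
    # Assumption: hyphens, apostrophes and spaces are ignored
    chars = [ch for ch in name.upper() if ch.isalnum()]
    if not chars:
        return missing  # assumption: placeholder when no name is recorded
    # Assumption: '2' stands in for a position beyond the end of the name
    return "".join(chars[p - 1] if p <= len(chars) else "2" for p in positions)


def slk581(family_name: str, given_name: str, dob_ddmmyyyy: str, sex_code: str) -> str:
    """Build a 14-character key in the form XXXXXDDMMYYYYX."""
    if len(dob_ddmmyyyy) != 8 or not dob_ddmmyyyy.isdigit():
        raise ValueError("date of birth must be eight digits (DDMMYYYY)")
    if len(sex_code) != 1 or not sex_code.isalnum():
        raise ValueError("sex must be a single alphanumeric character")
    return (
        _name_part(family_name, (2, 3, 5), "999")
        + _name_part(given_name, (2, 3), "99")
        + dob_ddmmyyyy
        + sex_code
    )


print(slk581("Smith", "Jane", "14021986", "2"))  # MIHAN140219862
```

Because the key is derived entirely from name, date of birth and sex, its reliability depends on those items being collected consistently.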

Data linkage

The potential to link datasets between agencies and service providers is becoming a matter of priority in Australia. Recommendation 204 of the RCFV specifically mentions the need to explore “opportunities for data linkage between existing data sets...to increase the relevance and accessibility of existing data”.22 Data linkage can also provide greater insight into perpetrator and victim-survivor engagement with services, and an opportunity to view an individual’s trajectory through the criminal justice and community services systems.

The most significant data items which assist with data linkage are those that can be used to denote unique individuals, cases, times and locations. Often the most desirable data items for this purpose are high quality UIDs. Specific details concerning UIDs are described earlier in this section; however, it is worth noting that for linkage purposes, the collection of external UIDs in combination with internally created UIDs is ideal. For example, a service may receive an L17 Risk Assessment and Management form from Victoria Police as part of an application or a referral for service, and where possible, agencies should collect the incident number that is recorded on this form, in addition to creating their own file ID and client ID. Retaining the incident number assigned by Victoria Police will allow for data collected by a service to be easily and reliably linked back to Victoria Police data if required.
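The value of retaining an external identifier can be illustrated with a small sketch that links a service's records to police records through the incident number. All field names here (file_id, police_incident_number, incident_number) are hypothetical, and real linkage would be subject to the privacy and legislative obligations discussed earlier.

```python
def link_by_incident_number(service_records, police_records):
    """Pair each service record with the police record sharing its incident number."""
    police_by_incident = {rec["incident_number"]: rec for rec in police_records}
    linked = []
    for rec in service_records:
        # Records without a stored incident number cannot be linked this way
        match = police_by_incident.get(rec.get("police_incident_number"))
        if match is not None:
            linked.append((rec, match))
    return linked


service = [
    {"file_id": "C14-12-002", "police_incident_number": "INC001"},
    {"file_id": "C14-12-003"},  # incident number was not collected
]
police = [{"incident_number": "INC001"}, {"incident_number": "INC002"}]
print(len(link_by_incident_number(service, police)))  # 1
```

The second service record cannot be matched, which shows why the external identifier should be captured at the point of referral.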

Other data items can be used for linkage purposes if they are collected in a consistent and accurate manner. These items include first and last name, date of birth and sex or gender information. Best practice methods for collecting this information are described earlier in this section.


11 Department of Home Affairs 2011, Improving the Integrity of Identity Data: Recording of a Name to Establish Identity 2011, Attorney-General’s Department, viewed 12 June 2018, www.homeaffairs.gov.au/about/crime/identity-security/guidelines-andstandards
12 AIHW Metadata Online Registry, Person – name usage type, code AAA, viewed 20 June 2018, http://meteor.aihw.gov.au/content/index.phtml/itemId/453366
13 Department of Home Affairs 2011, Improving the Integrity of Identity Data: Recording of a Name to Establish Identity 2011, Attorney-General’s Department, viewed 12 June 2018, www.homeaffairs.gov.au/about/crime/identity-security/guidelines-andstandards
14 ABS 2014, 1200.0.55.006 Age Standard Version 1.7, viewed 20 June 2018, www.abs.gov.au/ausstats/abs@.nsf/Lookup/1200.0.55.006main+features72014,%20Version%201.7
15 AIHW Metadata Online Registry, Person – date of birth, DDMMYYYY, viewed 12 June 2018, http://meteor.aihw.gov.au/content/index.phtml/itemId/287007
16 AIHW Metadata Online Registry, Date – accuracy indicator, code AAA, viewed 12 June 2018, http://meteor.aihw.gov.au/content/index.phtml/itemId/294429
17 ABS 2016, 1270.0.55.001 Australian Statistical Geography Standard (ASGS): Volume 1 – Main Structure and Greater Capital City Statistical Areas, viewed 20 June 2018, www.abs.gov.au/ausstats/abs@.nsf/mf/1270.0.55.001
18 AIHW Metadata Online Registry, Address details data dictionary, viewed 20 June 2018, http://meteor.aihw.gov.au/content/index.phtml/itemId/434713
19 DHHS 2010, Address reference data dictionary version 1.1, viewed 20 June 2018, www2.health.vic.gov.au/about/publications/policiesandguidelines/data-dictionary-address-reference
20 Commissioner for Privacy and Data Protection 2017, Information sheet: ‘Unique Identifier’ under the Privacy and Data Protection Act 2014 (Vic), viewed 12 June 2018, www.cpdp.vic.gov.au/menu-resources/resources-privacy/resources-privacy-information-sheets
21 AIHW Metadata Online Registry, Record – linkage key, code 581 XXXXXDDMMYYYYX, viewed 13 June 2018, http://meteor.aihw.gov.au/content/index.phtml/itemId/686241
22 RCFV 2016, Summary and recommendations, Victoria, p.101.
