Overview: Big Data

What is big data and how to deal with it

The term “big data” is often defined with reference to the characteristics of the volume, variety and velocity of data (the “three V’s”).

The characteristic of “volume” refers to the quantity or magnitude of data. “Variety” refers to the range of different types of data. “Velocity” refers to the speed at which data is generated, processed or analysed.

Other features of big data that have been identified are veracity, variability and complexity. “Veracity” refers to the reliability of data, recognising that certain data may be unreliable or imprecise. “Variability” refers to fluctuations in the rate or velocity of data flow. “Complexity” refers to the multiple sources from which data may be generated.

Big data is created, transferred, stored, hosted, used and processed daily in virtually all industry and community sectors, including by government, the private sector and other community organisations.

Certain types of data may require specific consents, approvals, licences, or agreements in order to permit an organisation to perform certain acts in relation to such data.

Big data may consist of personal information, data protected by copyright, data protected by moral rights, confidential information, and government official information.

Where an organisation enters into a commercial agreement with another organisation that involves the capture, receipt, transfer, hosting, processing or management of big data, there are a number of issues to consider, including:

  • which party is responsible for obtaining necessary consents;
  • what contractual obligations will apply in relation to the big data; and
  • the technical, functional and performance specifications that apply to dealing with the big data.

How to collect and use big data

Big data can be collected by an organisation from a large number of sources.

An organisation may seek to leverage big data using data analytics for a range of purposes.

Big data constitutes personal information if any individual to whom it relates is identifiable from the data, which is often the case.

Data analytics often involves the collection and analysis by an organisation of big data that comprises the personal information of various individuals.

It may not be possible or viable for an organisation to obtain the required consents from each individual whose personal information is being collected or analysed. In that case, the organisation may decide to use data anonymisation (or personal information “de-identification”) methods or techniques, so that it can collect, analyse and process big data that comprises personal information without breaching the Privacy Act 1988 (Cth) (Privacy Act) or its obligations of confidentiality.
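As an illustration only, the following minimal Python sketch shows three common de-identification techniques applied to a single record: removing direct identifiers, replacing a customer identifier with a keyed pseudonym, and generalising date of birth and postcode. The field names, the sample record and the secret key are hypothetical. Applying techniques of this kind does not of itself establish that data is de-identified for the purposes of the Privacy Act; that depends on whether individuals remain reasonably identifiable in the circumstances.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be generated securely
# and stored separately from the data set.
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Hypothetical names of fields that directly identify an individual.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonym(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the value as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(record: dict) -> dict:
    """Drop direct identifiers, pseudonymise the ID and generalise other fields."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the customer ID with a pseudonym that is stable for a given key.
    out["customer_id"] = pseudonym(str(record["customer_id"]))
    # Generalise an ISO date of birth (YYYY-MM-DD) to the year only.
    out["birth_year"] = int(out.pop("date_of_birth")[:4])
    # Generalise the postcode to its first digit.
    out["postcode_region"] = str(out.pop("postcode"))[0]
    return out


# Hypothetical sample record.
record = {
    "customer_id": "C-1001",
    "name": "Jane Citizen",
    "email": "jane@example.com",
    "date_of_birth": "1985-03-14",
    "postcode": "2000",
    "purchase_total": 149.95,
}

clean = de_identify(record)
```

The keyed digest lets records about the same customer be linked for analytics without storing the original identifier, but anyone holding the key can recompute the pseudonym, so the key would need to be protected as carefully as the original data.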

Preventative measures

Managing big data in a secure manner requires an organisation to implement an effective organisational data security compliance framework.

Such a framework should usually include:

  • regular audits of the organisation’s IT security policies, systems, controls, processes and practices;
  • effective IT security policies, systems, controls, processes and practices;
  • staff training in, and awareness of, data security obligations;
  • a positive and strong compliance culture; and
  • ongoing governance oversight.

An organisation that stores or uses big data should also ensure that its data security compliance framework is consistent with the legal and contractual obligations the organisation has to other parties with respect to how it stores and uses the big data.

Organisations that store or use big data should develop and maintain effective IT security policies, systems, controls, processes and practices to prevent, or minimise the risk of, a breach of data security obligations.

An organisation that stores or uses big data should conduct regular audits of its IT security systems, processes, practices and policies.

An organisation’s employees and contractors should receive regular training on compliance with data security requirements.

Organisations that store or use big data should develop and maintain a positive and strong compliance culture in relation to data security obligations.

Organisations that store or use big data should also implement effective internal governance processes and oversight of data security issues.