There was once a time when the mantra regarding corporate data was to keep everything, just in case.
But as recent data breaches have shown, this otherwise benign asset can quickly become a liability when it falls into the hands of cyber criminals.
For data professionals, this shift in thinking has expanded their remit to include how data is secured, along with the various regulatory regimes that govern the management and protection of personal data.
All of this means the key question for many data professionals is how to apply appropriate layers of security to their data without unduly impeding the purposes for which it was gathered in the first place.
The complexity of securing data means that in many organisations it is security professionals such as Mark Knowles who are taking the lead role in this task.
As the general manager of security assurance at business software maker Xero, Knowles has become adept at using the world's regulatory regimes as benchmarks for building a culture of security compliance.
"We always look to the most challenging standards around the world and work to those," Knowles said.
"GDPR is probably the toughest privacy standard, so we talk to our partners in data and work through that with them."
When it comes to communicating these standards to the teams tasked with building data products for Xero, Knowles said the organisation has adopted an approach that shifts conversations about data security to the 'left hand side' of the development pipeline. This ensures security is an early consideration in the development process, in line with his preference for the principles of the DevSecOps methodology.
"It is a challenge for organisations to have a true DevSecOps model, because developers and engineers are tasked with building a product or service and are given a deadline to build it as quickly as they can." - Mark Knowles, GM security assurance, Xero
"What we are proactively doing is trying to educate all of those people, so they are thinking about making sure things are securely but done quickly.
"We are looking at the tools that do look 'left', and there are some that we engage with already that enable us to see if there is rogue code. And this is where artificial intelligence comes into it as well, because AI is a good way of identifying whether there are tools that have bad code in them."
While this approach embeds security into the systems that use sensitive data, other methods are needed to ensure that data is secured in all contexts.
One of the traditional tools for data security is encryption, which has given rise to a market that Grand View Research estimates was valued at US$13.5 billion ($20.36 billion) in 2022 and is growing at 16.5 percent annually.
However, this tried and trusted technique faces a very real challenge in the approaching advent of 'Q-Day': the day when quantum computing becomes powerful enough to crack traditional encryption systems.
This in turn has fuelled interest in post-quantum cryptography (also called quantum-safe cryptography). But while these solutions will be beneficial for securing the data that organisations hold today, they will do little to protect the vast troves of non-quantum-safe encrypted data that some experts believe have already been exfiltrated by state actors, and which are lying in storage awaiting Q-Day's arrival.
Another approach advocates using data in such a way that its value to cyber criminals can be greatly reduced.
Data masking works by applying techniques that alter data so that it cannot be reverse-engineered back to its original values (such as replacing people's names with randomly generated words), but without altering its structure and utility. This allows the data to be manipulated and analysed without fear of exposure.
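As a minimal illustration of the principle (a hypothetical sketch, not any particular masking product), the Python snippet below replaces each distinct name with a consistent random token: the records keep their structure and can still be grouped and aggregated per customer, but the output alone cannot be traced back to the original names.

```python
import secrets

# In-memory mapping keeps masking consistent within a run; the masked
# output alone cannot be reversed back to the original names.
_mask_map: dict[str, str] = {}

def mask_name(name: str) -> str:
    """Return a consistent random pseudonym for a given name."""
    if name not in _mask_map:
        _mask_map[name] = "user_" + secrets.token_hex(4)  # e.g. 'user_9f3b2a1c'
    return _mask_map[name]

records = [
    {"name": "Alice Smith", "spend": 120.50},
    {"name": "Bob Jones", "spend": 87.00},
    {"name": "Alice Smith", "spend": 43.25},
]

# Structure and utility are preserved: the masked records can still be
# analysed per customer, but identities are hidden.
masked = [{**r, "name": mask_name(r["name"])} for r in records]
for row in masked:
    print(row)
```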
According to Mordor Intelligence, the data masking market is estimated at almost US$1 billion ($1.51 billion) in 2024 and is expected to reach US$1.87 billion ($2.82 billion) by 2029.
A related concept is data virtualisation, which enables applications to retrieve and use data without needing its technical details, effectively bringing multiple data sources together in one place without the dangers associated with creating centralised data pools.
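A hypothetical sketch of the idea in Python: a thin access layer answers queries by reaching into each registered source on demand, so consumers never touch connection details and no central copy of the data is created. Production data virtualisation platforms are far more capable, but the shape is similar.

```python
from typing import Callable, Iterator

class VirtualDataLayer:
    """Illustrative only: federates queries over sources without copying data."""

    def __init__(self) -> None:
        self._sources: dict[str, Callable[[], list[dict]]] = {}

    def register(self, name: str, fetch: Callable[[], list[dict]]) -> None:
        self._sources[name] = fetch  # e.g. a database query or an API call

    def query(self, predicate: Callable[[dict], bool]) -> Iterator[dict]:
        for name, fetch in self._sources.items():
            for row in fetch():          # data is pulled live, not pooled
                if predicate(row):
                    yield {**row, "_source": name}

layer = VirtualDataLayer()
layer.register("crm", lambda: [{"customer": "A1", "region": "AU"}])
layer.register("billing", lambda: [{"customer": "A1", "amount": 99.0, "region": "AU"}])

# Consumers query one logical layer, unaware of the underlying sources.
for row in layer.query(lambda r: r.get("region") == "AU"):
    print(row)
```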
And yet another technique gaining favour is the use of synthetic data, which is generated from an actual data set but stripped of identifiers. Because the synthetic data shares the statistical properties of the real data, it provides a strong basis for analysis without the risks associated with private and personal data.
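A toy example of the concept (deliberately simplified, assuming a single numeric column): the Python snippet below derives the mean and spread of a real data set and then samples new values with the same properties, so analysis can proceed without any real records. Real synthetic data generators model joint distributions and correlations far more carefully.

```python
import random
import statistics

# Source column only; no identifiers are carried into the synthetic set.
real_ages = [23, 35, 41, 29, 52, 38, 44, 31]
mu, sigma = statistics.mean(real_ages), statistics.stdev(real_ages)

# Sample new values with the same mean and spread as the real data.
synthetic_ages = [max(18, round(random.gauss(mu, sigma))) for _ in range(8)]
print("real mean/stdev: ", round(mu, 1), round(sigma, 1))
print("synthetic sample:", synthetic_ages)
```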
The popularity of synthetic data is such that Gartner has reported that by 2024, 60 percent of data for AI will be synthetic, used to simulate reality and future scenarios and to de-risk AI, up from one percent in 2021.
These techniques have gained favour at Tabcorp, where general manager for technology Matt McKenzie has been working with his security colleagues to find ways of using data that protect the people it describes and satisfy the needs of regulators.
"Synthetic data in non-production environments is a real must-do for us, and done in a way that most accurately reflects the production environment," McKenzie said.
"As we progress things through ideation, discovery, and delivery, and into production, the outcomes we are seeking align with what we originally started with."
Ironically, while once the greatest threat to data was its destruction, in some instances data professionals are now advocating for exactly that outcome.
Erasure is the natural end point of the data lifecycle once data has outlived its usefulness, with Facts and Factors reporting the global secure data destruction market as valued at around US$6.5 billion ($9.8 billion) in 2022 and expected to grow to US$12.5 billion ($18.85 billion) by 2030.
At Tabcorp, McKenzie said data retention has been a critical consideration, with him and his team systematically working to ensure they store the minimum amount of data needed for the minimum time, whether for regulatory reasons, legal obligations, or simply the good use of data for current customers.
"We have been really focused on going through and making sure that our hygiene and our ongoing upkeep of data retention is really top notch," McKenzie said.
"You are not going to make any more money by ensuring your data retention policies and strategies are up to date, but it is just good governance. We want our customers and the market to trust us, and that trust starts with us doing that proactively and not reactively." - Matt McKenzie, GM technology, Tabcorp
However, this last point can make data destruction difficult to justify economically, as the action serves no directly productive purpose and must be paid for from operational expenditure.
All of these strategies will only prove useful, however, if they are adhered to throughout the entire organisation. At Xero, Knowles said communication is the key to good data security.
"We actually try and educate the whole company so that security is something that we live and breathe, and that goes for data as well, so that people think about data privacy and data security as something they do as a natural thing," Knowles said.