Provable Anonymization of Grid Data for Cyberattack Detection

Project Summary

Organizations frequently decline to share data because they consider it sensitive. Laws or regulations may prohibit sharing for reasons of personal privacy or national security, or the organization that owns the data may regard it as a proprietary trade secret. In either case, the data cannot or will not be released in raw form, so alternative approaches are needed if it is to be shared at all.

Today, data is often not shared at all. When it is shared, the people processing or analyzing it are frequently required to work in highly secured, non-networked environments set up to prevent the data from being exfiltrated, whether physically from a building or over a network. These restrictions hinder a great deal of research. Sometimes data is instead shared after “anonymization,” in which values are typically either masked or generalized. Unfortunately, these techniques have repeatedly been shown to fail: an attacker merges external data containing identifying information with the quasi-identifiers that remain in the dataset, and thereby re-identifies the “anonymized” records.
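
The linkage attack described above can be illustrated with a minimal sketch. Everything in it is hypothetical and made up for illustration: the records, the field names (zip, install_year, avg_kwh), and the helper function link are not drawn from the project's data or methods. The sketch only shows how a release with names removed can still be joined to an outside dataset through shared quasi-identifiers.

```python
# Toy linkage (re-identification) attack. All records and field names
# are hypothetical and exist only to illustrate the idea.

# "Anonymized" release: owner names removed, quasi-identifiers retained.
released = [
    {"zip": "94720", "install_year": 2015, "avg_kwh": 412},
    {"zip": "94720", "install_year": 2018, "avg_kwh": 655},
    {"zip": "97204", "install_year": 2015, "avg_kwh": 3021},
]

# External data (for example, public records) that carries names
# alongside the same quasi-identifiers.
external = [
    {"name": "Alice", "zip": "94720", "install_year": 2015},
    {"name": "Bob", "zip": "97204", "install_year": 2015},
    {"name": "Carol", "zip": "10001", "install_year": 2020},
]

QUASI_IDENTIFIERS = ("zip", "install_year")


def link(released_rows, external_rows, keys):
    """Return {name: released_row} for every external record whose
    quasi-identifier values match exactly one released row."""
    # Index the released rows by their quasi-identifier values.
    index = {}
    for row in released_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    matches = {}
    for person in external_rows:
        candidates = index.get(tuple(person[k] for k in keys), [])
        # A unique match means the "anonymized" row is re-identified.
        if len(candidates) == 1:
            matches[person["name"]] = candidates[0]
    return matches


reidentified = link(released, external, QUASI_IDENTIFIERS)
for name, row in sorted(reidentified.items()):
    print(name, "->", row["avg_kwh"], "kWh")
```

In this toy case two of the three released rows are uniquely matched to a name, even though the release itself contains no names. Masking or generalizing more fields reduces the number of unique matches but offers no formal guarantee against other external datasets.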

This project aims to develop techniques that enable data analysis for detecting and investigating cyberattacks against energy delivery systems while preserving the confidentiality of key elements of the underlying raw data. Specifically, the project proposes to examine the application of privacy-preserving techniques to OT and grid-security-relevant IT data provided by the California Energy Commission (CEC), Kevala, and Portland General Electric, in order to protect privacy as much as possible and thereby minimize the amount of data to which “traditional” (and vulnerable) anonymization techniques must be applied. The result will be a solution for anonymizing data collected from OT and IT networks for energy grid cyberattack detection, tested for its ability to retain privacy properties while still enabling attack detection.

This project is supported by the U.S. Department of Energy’s Cybersecurity for Energy Delivery Systems (CEDS) program.

DOE Press Release: “Department of Energy Announces Awardees of $30 Million Research Call to Enhance Cybersecurity for Energy Delivery Systems,” August 27, 2019.

Principal Investigator:

Sean Peisert (PI; LBNL)

Co-Leads:

Anna Scaglione (Lead at Cornell Tech)
Aram Shumavon (Lead at Kevala)

Postdocs:

Tong Wu (Cornell Tech)

Graduate Students:

Andrew Campbell
Nikhil Ravi (Cornell Tech)
Leah Woldemariam

Partners:

Cornell Tech
Kevala, Inc.
California Energy Commission
Portland General Electric
SunPower

Past Researchers:

Rojin Zandi (Cornell Tech)
Sachin Kadam (ASU)
Daniel Arnold (LBNL)
Reinhard Gentz (LBNL)
Raksha Ramakrishna (KTH)
Ciaran Roberts (LBNL)

Past Partners:

Arizona State University

Press regarding this project:

Scientific Data Division Summer Students Tackle Data Privacy - Sept. 15, 2022

Presentations relating to this project:

Sean Peisert, “Cybersecurity and Privacy R&D for Science and Energy,” Networking and Information Technology Research and Development (NITRD) Cyber Security and Information Assurance (CSIA) Interagency Working Group (IWG) Meeting, March 24, 2022.

Publications resulting from this project:

Tong Wu, Anna Scaglione, Adrian Petru Surani, Daniel Arnold, and Sean Peisert, “Network-Constrained Reinforcement Learning for Optimal EV Charging Control,” Proceedings of the IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), October 31–November 3, 2023.

Robert Currie, Sean Peisert, Anna Scaglione, Aram Shumavon, and Nikhil Ravi, “Data Privacy for the Grid: Toward a Data Privacy Standard for Inverter-Based and Distributed Energy Resources,” IEEE Power & Energy Magazine, 21(5), pp. 48-57, Sept.-Oct. 2023.

Sachin Kadam, Anna Scaglione, Nikhil Ravi, Sean Peisert, Brent Lunghino, and Aram Shumavon, “Optimum Noise Mechanism for Differentially Private Queries in Discrete Finite Sets,” Proceedings of the 2023 IEEE International Conference on Smart Applications, Communications and Networking (SmartNets), Istanbul, Turkey, July 25–27, 2023.

Raksha Ramakrishna, Anna Scaglione, Tong Wu, Nikhil Ravi, and Sean Peisert, “Differential Privacy for Class-based Data: A Practical Gaussian Mechanism,” to appear in IEEE Transactions on Information Forensics and Security, 2023.

Nikhil Ravi, Anna Scaglione, Julieta Giraldez, Parth Pradhan, Chuck Moran, and Sean Peisert, “Solar Photovoltaic Systems Metadata Inference and Differentially Private Publication,” arXiv preprint arXiv:2304.03749, 7 Apr 2023.

Nikhil Ravi, Anna Scaglione, Sachin Kadam, Reinhard Gentz, Sean Peisert, Brent Lunghino, Emmanuel Levijarvi, and Aram Shumavon, “Differentially Private K-means Clustering Applied to Meter Data Analysis and Synthesis,” IEEE Transactions on Smart Grid, June 17, 2022.

Anna Scaglione, “The Use of Differential Privacy for Energy Data,” Proceedings of the 8th ACM on Cyber-Physical System Security Workshop (CPSS ’22), May 30-June 2, 2022. https://doi.org/10.1145/3494107.3522780

Nikhil Ravi, Anna Scaglione, Sachin Kadam, Reinhard Gentz, Sean Peisert, Brent Lunghino, Emmanuel Levijarvi, and Aram Shumavon, “Differentially Private K-means Clustering Applied to Meter Data Analysis and Synthesis,” arXiv preprint arXiv:2112.03801, 7 Dec 2021.

Sachin Kadam, Anna Scaglione, Nikhil Ravi, Sean Peisert, Brent Lunghino, and Aram Shumavon, “Optimum Noise Mechanism for Differentially Private Queries in Discrete Finite Sets,” arXiv preprint arXiv:2111.11661, 23 Nov 2021.

Nikhil Ravi, Anna Scaglione, and Sean Peisert, “Colored Noise Mechanism for Differentially Private Clustering,” arXiv preprint arXiv:2111.07850, 15 Nov 2021.

More information is available on other Berkeley Lab R&D projects focusing on cybersecurity in general, as well as specifically on cybersecurity for energy delivery systems.