Bolstering integrity in environmental data science and machine learning requires understanding socioecological inequity

Bozeman, Joe F.

doi:10.1007/s11783-024-1825-2

Perspectives
Published: 08 February 2024

Volume 18, article number 65, (2024)

Frontiers of Environmental Science & Engineering

Joe F. Bozeman III^1,2

Abstract

Socioecological inequities in environmental data science, such as those deriving from data-driven approaches and machine learning (ML), are current issues subject to debate and evolution. There is growing consensus around embedding equity throughout all research and design domains, from inception to administration, while also addressing procedural, distributive, and recognitional factors. Yet, doing so in practice may seem onerous or daunting to some. The current perspective helps to alleviate these concerns by substantiating the connection between environmental data science and socioecological inequity, using the Systemic Equity Framework, and provides the foundation for a paradigmatic shift toward normalizing the use of equity-centered approaches in environmental data science and ML settings. Bolstering the integrity of environmental data science and ML is just beginning from an equity-centered tool development and rigorous application standpoint. To this end, this perspective also provides relevant future directions and challenges by overviewing some meaningful tools and strategies (such as applying the Wells-Du Bois Protocol, employing fairness metrics, and systematically addressing irreproducibility) and emerging needs and proposals (such as addressing data-proxy bias and supporting convergence research), and it establishes a ten-step path forward. After all, the work that environmental scientists and engineers do ultimately affects the well-being of us all.

Acknowledgements

I would like to thank the National Science Foundation of the USA for facilitating funded network building, convergence exploration, and equity concept development (Nos. 2115405, 2241237, and 2115453). Each of these funded efforts was distinctly meaningful in the development of this perspective.

Author information

Authors and Affiliations

School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Joe F. Bozeman III
School of Public Policy, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Joe F. Bozeman III

Corresponding author

Correspondence to Joe F. Bozeman III.

Ethics declarations

Conflict of Interests: Joe F. Bozeman III is an Editorial Board Member of Frontiers of Environmental Science & Engineering. The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Highlights

• Socioecological inequity must be understood to improve environmental data science.

• The Systemic Equity Framework and Wells-Du Bois Protocol mitigate inequity.

• Addressing irreproducibility in machine learning is vital for bolstering integrity.

• Future directions include policy enforcement and systematic programming.

About this article

Cite this article

Bozeman, J.F. Bolstering integrity in environmental data science and machine learning requires understanding socioecological inequity. Front. Environ. Sci. Eng. 18, 65 (2024). https://doi.org/10.1007/s11783-024-1825-2

Received: 13 September 2023
Revised: 03 January 2024
Accepted: 10 January 2024
Published: 08 February 2024
DOI: https://doi.org/10.1007/s11783-024-1825-2

Part of a collection:

Artificial Intelligence/Machine Learning on Environmental Science & Engineering
