AI-Native and Agentic Data Governance: From Rule-Based Policies to Self-Healing Metadata Systems

Authors

  • Kuladeep Sandra, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V7I2P106

Keywords:

AI Governance, Automated Policy Enforcement, Intelligent Data Catalog, Active Metadata, LLM-Driven Governance, Policy Violation Detection, Metadata Enrichment

Abstract

Data governance has progressed through three distinct generations. The first was manual: stewards, policy documents, spreadsheets, and quarterly reviews. The second was rule-based automation: policy engines, schema contracts, and automated access control derived from classification metadata. The third, emerging now, is AI-native: large language models (LLMs) and agentic systems that infer governance attributes, detect violations, and propose remediation autonomously. This paper surveys that progression and takes a position on what AI-native governance can and cannot do. The position is grounded in operational experience running system-enforced governance in a 30-engineer data organization that achieved a 45 percent error reduction through deterministic automation alone. The case study evidence includes 500,000 files consolidated to 3,000 governed assets, $1.4 million in annual storage savings, 93 percent registration compliance, and 87 percent classification coverage; together these establish the baseline against which AI-native enhancements should be evaluated. The paper examines three research questions: how each generation of governance has worked and where each has failed; what AI-native techniques can productively add to the rule-based foundation; and what guardrails are non-negotiable when applying LLM-based agents to governance decisions with regulatory consequences. The central argument has three parts: AI-native governance is an evolutionary step rather than a replacement; the residual cases where deterministic automation falls short are exactly where LLMs offer the most value; and disciplined use of AI in governance requires accepting that some categories of decisions are properly human and should remain so.
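The layering described in the abstract (deterministic rules first, LLM assistance for the residual cases, human judgment for decisions with regulatory consequences) can be illustrated with a minimal sketch. Everything named here is a hypothetical chosen for illustration and is not taken from the paper: the `Asset` fields, the `ACCESS_RULES` table, the role names, and the `regulated` flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    DETERMINISTIC = "deterministic"  # a rule decides the outcome
    LLM_PROPOSAL = "llm_proposal"    # a model may propose; it does not decide
    HUMAN_ONLY = "human_only"        # reserved for human judgment


@dataclass
class Asset:
    name: str
    classification: Optional[str]  # None means the asset is not yet classified
    regulated: bool                # hypothetical flag: regulatory consequences


# Hypothetical rule table: classification label -> roles allowed to read.
ACCESS_RULES = {
    "public": {"analyst", "engineer", "steward"},
    "internal": {"engineer", "steward"},
    "restricted": {"steward"},
}


def route(asset: Asset) -> Route:
    """Pick the tier that handles an asset's governance decision."""
    # Deterministic automation handles every case a rule already covers.
    if asset.classification in ACCESS_RULES:
        return Route.DETERMINISTIC
    # Residual cases with regulatory consequences stay with a human.
    if asset.regulated:
        return Route.HUMAN_ONLY
    # The remaining residual cases are candidates for an LLM proposal.
    return Route.LLM_PROPOSAL


def allowed_roles(asset: Asset) -> set:
    """Derive read access from classification metadata; deny by default."""
    if route(asset) is not Route.DETERMINISTIC:
        return set()
    return ACCESS_RULES[asset.classification]
```

In this sketch an unclassified asset grants no access until a classification is confirmed, so a model's output can only ever feed a proposal queue rather than change access directly. That is one possible reading of the guardrail the abstract calls for, not a description of the author's system.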

Published

2026-04-11

Section

Articles

How to Cite

Sandra K. AI-Native and Agentic Data Governance: From Rule-Based Policies to Self-Healing Metadata Systems. IJERET [Internet]. 2026 Apr. 11 [cited 2026 Apr. 20];7(2):46-9. Available from: https://ijeret.org/index.php/ijeret/article/view/566