Fault-Tolerant Architecture for Real-Time Payment Processing in Large-Scale Telecom Billing Systems

Authors

  • Parth Patel, Independent Researcher, Pennsylvania, USA

DOI:

https://doi.org/10.63282/3050-922X.IJERET-V4I1P122

Keywords:

Payment Gateway Integration, Fault Tolerance, CyberSource, High Availability, Idempotency, Circuit Breaker Pattern, Telecom Billing Systems, Microservices Architecture, PCI DSS Compliance, Real-Time Transaction Processing, Exponential Backoff, Apache Kafka, Distributed Systems Resilience

Abstract

Payment processing systems that serve large consumer-facing organizations must handle millions of transactions per day while remaining reliable. A single point of failure in the payment pipeline can cause measurable revenue loss and customer dissatisfaction. This paper presents a fault-tolerant architecture for integrating CyberSource payment services into a large-scale Internet Service Provider (ISP) billing platform that handles payments for over three million active customer accounts. The system achieves high availability by combining idempotent API design, retry logic with exponential backoff, a circuit breaker calibrated for payment gateway behavior, asynchronous event queuing with Apache Kafka, and distributed transaction tracing. The architecture reduces mean recovery time from over eighteen minutes to under one minute, eliminates duplicate charge incidents through idempotency enforcement, and sustains a 99.94% transaction success rate during observed periods of CyberSource API instability. The paper describes the design decisions behind each component, the CyberSource REST API integration approach, lessons learned from operating the system in production, and performance data observed before and after the fault-tolerant design was adopted.
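
To make the interplay of three of the mechanisms named in the abstract concrete, the following is a minimal Python sketch of idempotency keys, exponential backoff, and a circuit breaker working together. It is an illustration under stated assumptions, not the implementation described in the paper: every name (CircuitBreaker, TransientGatewayError, idempotency_key, charge_with_retry, the send callable), the choice of SHA-256 for key derivation, the jitter strategy, and all threshold and timing values are hypothetical, and the sketch does not call the CyberSource API.

```python
import hashlib
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are being rejected."""


class TransientGatewayError(Exception):
    """Hypothetical stand-in for a retryable gateway failure (timeout, 5xx)."""


class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a cool-down."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.reset_timeout = reset_timeout          # assumed value, in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a trial call through once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def idempotency_key(account_id, invoice_id, amount_cents):
    """Derive a stable key so a retried charge can be recognized as a duplicate."""
    raw = f"{account_id}:{invoice_id}:{amount_cents}".encode()
    return hashlib.sha256(raw).hexdigest()


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def charge_with_retry(send, breaker, key, payload, max_attempts=4, sleep=time.sleep):
    """Call send(key, payload), reusing the same key on every retry."""
    for attempt in range(max_attempts):
        if not breaker.allow():
            raise CircuitOpenError("gateway circuit is open")
        try:
            result = send(key, payload)
        except TransientGatewayError:
            breaker.record_failure()
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
        else:
            breaker.record_success()
            return result
```

In this sketch the same key accompanies every retry of a given charge, so a gateway or billing service that deduplicates on that key would not charge twice if an earlier attempt had in fact succeeded but its response was lost. The breaker stops further calls once consecutive failures reach the threshold, which is where a design like the one described could hand the request to an asynchronous queue instead of retrying inline.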

Published

2023-03-30

Section

Articles

How to Cite

1.
Patel P. Fault-Tolerant Architecture for Real-Time Payment Processing in Large-Scale Telecom Billing Systems. IJERET [Internet]. 2023 Mar. 30 [cited 2026 Jun. 11];4(1):210-7. Available from: https://ijeret.org/index.php/ijeret/article/view/613