DevOps Content on InfoQ

Articles

DevOps

Ceph RBD Turns 15: a Story of Open Source Creation

Fifteen years ago, Ceph RBD began as a community-driven idea that grew into essential infrastructure powering today's cloud platforms. This insider story from Yehuda Sadeh-Weinraub reveals how two developers started a distributed storage project that now supports OpenStack and Kubernetes through transparent, collaborative development.

Yehuda Sadeh-Weinraub
on Jul 07, 2025
DevOps

Why Is My Docker Image So Big? A Deep Dive with ‘dive’ to Find the Bloat

Docker images for AI workloads typically bloat from massive library installations and base OS components, and large images slow AI development and increase costs. Chirag Agrawal demonstrates how to diagnose bloat using Docker's history command and the interactive 'dive' tool to examine each layer in detail. The article shows how effective diagnosis leads to targeted optimizations.

Chirag Agrawal
on Jun 30, 2025
Cloud

Engineering Principles for Building a Successful Cloud-Prem Solution

Discover how Cloud-Prem solutions combine cloud efficiency with on-premise control, meeting data sovereignty and compliance demands while optimizing operational costs and enhancing customer security.

Satyam Dhar
on Jun 26, 2025
DevOps

Analyzing Apache Kafka Stretch Clusters: WAN Disruptions, Failure Scenarios, and DR Strategies

This article analyzes the dynamics of Apache Kafka stretch clusters, assessing the impact of WAN disruptions and outlining Disaster Recovery (DR) strategies. It covers how to maintain high availability and data integrity across multi-region deployments and how to improve operational resilience so that vital services do not violate their service level agreements.

Srikanth Daggumalli, Nishchai Jayanna Manjula
on Jun 20, 2025
Cloud

We Took Developers out of the Portal: How APIOps and IaC Reshaped Our API Strategy

This article describes how legacy API management was transformed into an APIOps framework built on Infrastructure as Code (IaC). It covers automating API lifecycles, strengthening security, and improving developer productivity through CI/CD integration, with the goal of consistency across environments and faster deployment.

Balakrishna Sudabathula
on Jun 12, 2025
DevOps

Inflection Points in Engineering Productivity as Amazon Grew 30x

In this article, Carlos Arguelles explains how engineering productivity needs shift as organizations scale. He shares examples from his time at Google and Amazon, explaining how some architectural decisions made at these companies shaped the way they develop software. Engineering productivity investments depend on inflection points, scale, controls, data, and tooling choices.

Carlos Arguelles
on May 26, 2025
Cloud

Distributed Cloud Computing: Enhancing Privacy with AI-Driven Solutions

Distributed cloud, PETs, and AI enable secure, private data processing. This integration enhances collaboration, security, and compliance across marketing, finance, and healthcare, addressing the growing need for data protection.

Rohit Garg, Ankit Awasthi
on Apr 25, 2025
Cloud

DiRMA: Measuring How Your Organization Manages Chaos

DiRMA is a framework for assessing and improving Disaster Recovery Testing (DiRT) maturity across people, processes, and tools. As chaos engineering becomes essential for resilience, DiRMA guides organizations through structured improvement, addressing cultural hurdles and strengthening recovery readiness.

Yury Niño Roa
on Mar 28, 2025
DevOps

Checklist for Kubernetes in Production: Best Practices for SREs

This article provides SREs with a checklist for managing Kubernetes in production environments. It identifies common challenges including resource management, workload placement, high availability, health probes, storage, monitoring, and cost optimization. By implementing consistent GitOps automation across these areas, teams can significantly reduce complexity and prevent downtime.

Utku Darilmaz
on Mar 10, 2025
Architecture & Design

2025 Article Contest: Win Your Conference Ticket

The InfoQ Team is excited to invite you to participate in our annual article writing competition. Authors of top-rated articles will win complimentary tickets to prominent software development conferences such as QCon and InfoQ Dev Summit.

InfoQ
on Feb 17, 2025
Cloud

Being Functionless: How to Develop a Serverless Mindset to Write Less Code!

Dynamic cloud services like AWS Lambda have revolutionized computing, leading to rapid deployment and innovation in serverless technology. However, over-reliance on Functions as a Service (FaaS) can create complex architectures and increase costs. Adopting a functionless mindset and leveraging native service integrations fosters simplicity, enhances sustainability, and optimizes efficiency.

Sheen Brisals
on Jan 03, 2025
Architecture & Design

Taking Advantage of Cell-Based Architectures to Build Resilient and Fault-Tolerant Systems

Cell-based architectures offer a robust approach to building resilient systems. They achieve this through the core principles of isolation, autonomy, and replication. Each cell manages its resources and makes decisions autonomously. Observability for cell-based architecture requires a tailored approach to address the unique challenges and opportunities presented by this distributed system design.

Yury Niño Roa
on Oct 21, 2024
