System Engineer / DevOps Engineer

Building reliable Linux, AWS, and CI/CD infrastructure for production teams.

I manage high-availability systems across Linux, AWS, Nutanix, databases, application servers, monitoring, and release automation with a focus on uptime, speed, and disaster readiness.

Production Reliability Snapshot
99.9%+

Uptime SLA

~60%

Deployment Time Reduced

3.5+

Years Experience

24/7

Monitoring Mindset

Git, Jenkins, Ansible, AWS

About

Operations-focused engineer with production ownership across systems, cloud, and releases.

Sonali Rai is a results-driven System Engineer and DevOps professional with 3.5+ years of hands-on experience managing enterprise Linux infrastructure, AWS cloud environments, and Nutanix hyperconverged platforms.

Her work spans CI/CD pipeline automation, zero-downtime deployments, MySQL administration, infrastructure monitoring, disaster recovery, and security hardening. She focuses on practical DevOps implementation that improves system reliability, scalability, and delivery speed in production environments.

She has worked with production services that require disciplined release handling, reliable backups, clear monitoring, and fast troubleshooting across web, application, and database layers. Her portfolio highlights the type of engineering work that keeps business-critical systems stable while teams continue to ship.

Current Company: ValueFirst, A Twilio Company
Location: Gurugram, Haryana
Education: B.Tech, Computer Science

Core Expertise

Hands-on ownership from server health to release delivery.

Reliability Engineering

Maintains Linux production infrastructure with uptime-focused operations, access controls, patching, hardening, incident response, and service health checks.

  • 99.9%+ uptime-oriented operations
  • CIS-style security hardening practices
  • Log rotation, backup jobs, and health scripts

Cloud Infrastructure

Builds and manages AWS infrastructure for secure, scalable, and fault-tolerant application environments across compute, network, storage, and database layers.

  • VPC public/private subnet architecture
  • ELB, EC2, EBS, S3, RDS, IAM, CloudWatch
  • AMI and snapshot lifecycle automation

Release Automation

Improves delivery speed through Jenkins and Ansible pipelines, scripted deployments, rolling release methods, and reduced manual intervention during production releases.

  • Weekly release automation
  • Zero-downtime deployment approach
  • Tomcat to WildFly migration support

Plan: Infrastructure changes, access, rollout path
Automate: Jenkins, Ansible, Bash, AWS CLI
Release: Rolling deploys, validation, rollback readiness
Observe: CloudWatch, ELK, Datadog, Site24x7

Technical Skills

Infrastructure, cloud, automation, and observability toolkit.

Systems

CentOS, Ubuntu, RHEL, Windows Server, patching, user access, troubleshooting, service health, log rotation, and hardening.

Linux, RHEL, Hardening

AWS Cloud

EC2, S3, RDS, IAM, ELB, VPC, CloudWatch, EBS, AMI, Route 53, Multi-AZ failover, lifecycle policies, and secure account access.

EC2, VPC, RDS, IAM

DevOps & CI/CD

Jenkins, Ansible, Git, CI/CD pipelines, infrastructure as code, release orchestration, deployment validation, and repeatable automation.

Jenkins, Ansible, Git

Application Stack

JBoss EAP, WildFly, Apache Tomcat, and Nutanix AHV, with working familiarity with VMware and Docker and exposure to Kubernetes.

WildFly, Tomcat, Nutanix

Databases

MySQL master-master replication, RDS administration, scheduled backups, query optimization, MongoDB, and ClickHouse for high-throughput workloads.

MySQL, RDS, MongoDB

Monitoring & DR

ELK Stack, Datadog, Site24x7, CloudWatch alarms, Citrix Load Balancer dashboards, backup optimization, and RTO/RPO-focused DR support.

ELK, Datadog, DR

Experience

Production engineering at scale.

System Engineer / DevOps Engineer

Sept 2022 - Present

ValueFirst, A Twilio Company - Gurugram, Haryana

Responsible for production infrastructure operations across Linux servers, AWS cloud services, application servers, databases, monitoring platforms, and release pipelines supporting ValueFirst voice and messaging services.

Linux Infrastructure

Administered CentOS and Ubuntu production workloads, maintained 99.9%+ uptime SLAs, and handled hardening, patching, access control, and troubleshooting.

  • Managed JBoss, WildFly, Tomcat, MySQL, MongoDB, and ClickHouse environments.
  • Improved routine operations with Bash scripts for backups, logs, snapshots, and health checks.

Release Automation

Designed Jenkins and Ansible CI/CD pipelines, automated weekly releases, and reduced manual deployment steps with rolling deployment practices.

  • Reduced deployment time by approximately 60% through automation.
  • Supported zero-downtime releases across JBoss and WildFly services.

AWS Engineering

Built and managed EC2, EBS, VPC, IAM, ELB, S3, CloudWatch, RDS, AMI, NAT gateways, route tables, and secure multi-tier architecture.

  • Designed public/private subnet layouts, VPC peering, and secure routing.
  • Managed least-privilege IAM roles, users, groups, and policies.

Operations & DR

Maintained AWS DR infrastructure, monitored systems with Datadog, ELK, and Site24x7, and supported the ValueFirst VOICE and Infinito platforms.

  • Configured CloudWatch dashboards, metric alarms, and log groups.
  • Maintained backup and failover practices for critical services.

Impact Work

Resume achievements presented as engineering outcomes.

01. CI/CD Automation

Automated weekly production releases using Jenkins and Ansible, eliminating manual release steps and reducing deployment time by approximately 60%.

Tools: Jenkins, Ansible, Git, Bash

02. Zero-Downtime Migration

Led migration from Apache Tomcat to WildFly with rolling deployment strategy, improving production stability and maintainability.

Focus: App servers, release safety, production validation

03. AWS Three-Tier Architecture

Designed secure web, application, and database layers with VPC networking, load balancing, IAM controls, and fault-tolerant cloud infrastructure.

Stack: VPC, EC2, ELB, RDS, IAM, CloudWatch

04. Monitoring & Disaster Recovery

Implemented dashboards, alarms, backups, AMI snapshots, EBS lifecycle automation, and DR systems aligned with production RTO/RPO needs.

Coverage: Observability, backup, failover readiness

Education

Computer science foundation with continuous DevOps learning.

B.Tech - Computer Science & Engineering

APJ Abdul Kalam Technical University

2018 - 2022

12th CBSE / 10th CBSE

Children Senior Secondary School, Azamgarh

Continuously upskilling in Kubernetes, Terraform, and Docker containerization.

Contact

Open to DevOps, cloud infrastructure, and system engineering opportunities.