Lead Software Engineer
Company: Disneyland Hong Kong
Location: Burbank
Posted on: April 25, 2025
Job Description:
At Disney, we're storytellers. We make the impossible, possible.
The Walt Disney Company (TWDC) is a world-class entertainment and
technological leader. Walt's passion was to continuously envision
new ways to move audiences around the world, a passion that remains
our touchstone in an enterprise that stretches from theme parks,
resorts and a cruise line to sports, news, movies and a variety of
other businesses. Uniting each endeavor is a commitment to creating
and delivering unforgettable experiences - and we're constantly
looking for new ways to enhance these exciting experiences.
Job Summary:
Are you passionate about working on large-scale,
high-performance software systems with a strong focus on service
reliability? Join our team as a Lead Software Engineer dedicated to
ensuring the availability, scalability, and resilience of our
cloud-based services. In this role, you'll leverage your expertise
in service reliability and cloud platforms to create scalable
solutions that support the entire Disney Parks, Experiences, and
Products segment. You will collaborate with cross-functional teams,
offering technical expertise to develop products and tooling, while
ensuring the smooth delivery of features and capabilities for the
Data Platform and Products team, ultimately enhancing the magic of
our guests' experiences.
Responsibilities:
- Service Reliability: Establish key service-level indicators
(SLIs) and continuously monitor them to ensure system reliability,
availability, and performance. Proactively develop alerts and
automated responses to prevent service degradation or outages.
- Lead Cross-Functional Projects: Drive complex projects across
multiple teams and disciplines, ensuring high availability,
resilience, and minimal downtime of services.
- Root Cause Analysis & Incident Response: Conduct deep analysis
of system issues, identify root causes, and define actionable
strategies for remediation. Lead post-mortem analysis and
continuous improvement efforts.
- Infrastructure & System Optimization: Focus on the high
availability, scalability, and performance of services in
production environments, ensuring they meet business and customer
needs.
- Capacity Planning & Right-Sizing: Lead efforts to ensure that
services are properly scaled for current and future workloads.
Engage in capacity planning to optimize resource utilization.
- Documentation & Runbooks: Maintain detailed documentation and
create robust runbooks for incident management and troubleshooting,
ensuring smooth responses to service disruptions.
- Cloud Infrastructure & IaC: Utilize tools such as Terraform and
AWS CDK to manage and automate infrastructure as code (IaC) in a
cloud-native environment.
- Service Monitoring & Observability: Oversee observability and
monitoring across the platform (AWS, serverless, containers,
Snowflake, etc.), ensuring actionable insights are available for
operational teams.
- Collaboration with Development Teams: Work closely with
developers to ensure that applications are designed for service
reliability, scalability, and maintainability.
- GitOps-Driven Environment: Drive infrastructure changes and
service deployment using GitOps practices to ensure consistency and
traceability in deployments.
- Code Quality: Write clean, performant, and well-documented
application code with a focus on reliability and service
availability.
- Automation & Tooling: Build and maintain automated deployment
pipelines and tooling for monitoring and platform testing.
- Mentorship: Provide technical guidance and mentorship to junior
engineers and assist in technical decision-making and code
reviews.
Qualifications:
- Experience: 7+ years of experience in software development,
with at least 3 years focused on service reliability, data
management, and distributed systems.
- Technical Expertise: Expert-level coding skills in Python, with
a deep understanding of performance and resource optimization,
including considerations of space and time complexity.
- Cloud Infrastructure: Strong experience with AWS services such
as Lambda, ECS Fargate, S3, VPC, Kinesis, EventBridge, and related
components.
- Serverless & Containers: Proven experience in working with
serverless (AWS Lambda) and containerized environments (Docker,
ECS, EKS).
- Infrastructure as Code (IaC): Proficiency with IaC tools such
as Terraform and AWS CDK to automate infrastructure
management.
- CI/CD & DevOps: Extensive experience in CI/CD pipelines,
automated testing, and general DevOps practices for continuous
integration and deployment.
- Observability: Hands-on experience with observability and
monitoring tools such as DataDog, AppDynamics, New Relic, or
similar suites.
- Agile & DevOps: Strong proponent of Agile methodologies and of
applying DevOps principles for continuous improvement.
Education:
- Bachelor's degree in Computer Science, Engineering, Information
Technology, or related field, or equivalent experience.
Preferred Qualifications:
- Certifications in AWS, Snowflake, or relevant service
reliability tools.
- In-depth experience with observability suites and Application
Performance Management (APM) tooling such as DataDog, AppDynamics,
New Relic, etc.