Principal Site Reliability Engineer Job at Salesforce, San Francisco, CA

  • Salesforce
  • San Francisco, CA

Job Description

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category: Software Engineering

About Salesforce:

We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And we empower you to be a Trailblazer, too, driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good, you’ve come to the right place.

Job Details: (Lead/Principal/Architect) Software Engineer - Availability Engineering

Our Availability Engineering teams are responsible for driving ‘best in class’ availability. You will work with delivery teams deploying customer-facing and customer-supporting software across a multi-substrate engineering platform that collectively ships hundreds of features to production for tens of millions of users across all industries every day. Our users count on our applications and platforms to be highly reliable, lightning fast, supremely secure, and to preserve all of their customizations and integrations every time we ship.

You will need deep experience with concurrency and large-scale systems, proficiency with solving real-world data management challenges, a strong understanding of how to craft solutions that are highly available, and a proven ability to design, develop, and optimize core back-end systems.

What you’ll be doing:

As part of a specialist unit focused on availability and resilience, you will embed with delivery teams, acting in a Lead capacity, creating bandwidth and prioritizing a focus on corrective and proactive availability measures. Your work will include:

  • Contributing to designing, developing, debugging, and operating resilient applications and platforms deployed across distributed systems that run across thousands of compute nodes in multiple data centers.
  • Championing resiliency best practices: observability tool integration, horizontal/vertical sizing and auto-scaling, release rollback and recovery workflows, and integration tests and validation procedures for applications running on self-hosted infrastructure as well as public cloud platforms such as AWS, GCP, Azure, and Alibaba.
  • Using and contributing to open source technology (Spinnaker, Zookeeper, etc.).
  • Developing and leveraging Infrastructure-as-Code using Terraform.
  • Building and integrating with APIs and microservices deployed on containerization frameworks such as Kubernetes, Docker, Mesos, etc.
  • Resolving complex technical issues and driving innovations that improve system availability, resilience, and performance.
  • Drawing on your experience balancing live runtime management, feature delivery, and retirement of technical debt.
  • Participating in the team’s on-call rotation to address complex problems in real time and keep services operational and highly available.

Required Skills:

  • A related technical degree is required (master’s preferred).
  • 15+ years of hands-on software development experience.
  • 5+ years in a Tech Lead, Principal, or Architect capacity.
  • Ability to reverse engineer solutions via independent code and architecture review, and to envision, define, and then contribute to the delivery of availability improvement refactoring projects.
  • Mastery of one or more object-oriented delivery languages such as Java, Golang, APEX, or Python.
  • Deep experience working with core web technologies: JSON, REST, XML.
  • Proficiency with databases including Oracle or other relational and/or NoSQL solutions.
  • Experience owning and operating multiple instances of a critical service.
  • Experience running critical infrastructure services: monitoring, alerting, logging, tracing, and reporting.
  • Subject matter expertise in service ownership best practices and SLO/I/A definition, a record of driving proactive operational awareness, and experience with Incident/Problem management.
  • Thorough knowledge of Agile development methodology, with experience in both Test-Driven and Behavior-Driven Development practices.
