Site Reliability Engineer
Yondu
Handle service monitoring, incident response, and drive technical support efficiency
Responsible for managing and maintaining network monitoring tools, systems, and
processes that ensure the availability, scalability, and performance of our production environments.
Responsible for incident handling, service monitoring, and technical support efficiency.
Work closely with developers, DevOps, infrastructure teams, and other stakeholders
to achieve proactive incident prevention, issue resolution, and incident documentation.
Key Responsibilities:
Ensure that all tickets are updated and handled according to set KPIs and SLAs.
Manage monitoring, alerting, and logging tools to ensure system health and service
uptime.
Ensure early detection, triage, and escalation of service degradation based on defined
service level agreements.
Trigger L2 ticket handling and on-call rotations for critical incidents.
Execute triage, diagnosis, and resolution of incidents that require L3 escalation, working with both
internal and third-party support teams.
Support major incident response, contribute to root cause analysis (RCA), and help
document postmortems.
Track, analyze, and act on incident trends and recurring technical issues.
Use data from ticketing systems (Jira, ServiceNow, etc.) to improve team responsiveness
and resolution quality.
Update and maintain SOPs, runbooks, and knowledge base articles including the
documentation of known issues, fixes, and playbooks to improve mean time to resolution.
Collaborate with development and QA teams to improve deployment readiness and
reliability.
Participate in technical competency mapping to ensure coverage and reduce unnecessary escalations.
Qualifications and Experience:
Bachelor's degree in Electronics Engineering, Information Technology, Computer
Science, Management Information Systems, or equivalent.
3-5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles (a minimum of 3 years is required).
Hands-on experience with monitoring tools (e.g., Prometheus, Grafana, ELK, or Datadog).
Familiarity with incident response and troubleshooting in production systems.
Experience with at least one cloud platform (AWS, GCP, or Azure).
Knowledgeable in scripting (e.g., Python, Bash) and Linux systems.
Exposure to ITIL-based processes, especially Incident and Problem Management.
Experience working in fintech, banking, or SaaS with high availability SLAs.
Familiarity with DevOps practices, CI/CD pipelines, and cloud-based monitoring tools.
Experience with automation platforms.
Knowledge of BSP regulatory frameworks, policies, and guidelines.
Entry Level / Junior, Apprentice
IT and Software
Information Technology / IT
1 opening
Bachelor's degree graduate

Yondu is a Philippine-based IT solutions company owned by Globe Telecom. We empower businesses across various industries through a wide array of innovative technology solutions to help them scale in the new digital economy. Our mission is to create happier technological experiences by turning great ideas into excellent and valuable business solutions.

If you're looking to advance your career in the IT industry, Yondu is the best place to be. You will be a part of a young, dynamic culture that always pursues innovation and growth. As a Yondude, you'll gain fresh perspectives from a team of knowledgeable and competitive individuals and learn how to develop cutting-edge business solutions that go above and beyond. You'll also enjoy collaborating with brilliant and fun people who are ready to take on the world.

Established in 2001, our company has grown exponentially, and we're always on the lookout for highly skilled and competitive individuals to innovate and evolve with. If this sounds exciting for you, #BeAYondude, and leverage your expertise to help us achieve our ultimate goal of going above and beyond!

Join us and grow your career in the IT industry!
Kalibrr
Other Info
Permanent
Full-time
Metro Manila