Job Description:
If you are passionate about playing a key role in the success of an AI-driven technology company, we want to hear from you!
Our client is a well-established brand in the IT services and consulting industry and they are looking for a passionate and driven Senior Site Reliability Engineer (SRE) to join their team.
Our client is seeking talented individuals who want to make a difference in the world, have a strong drive for continuous learning and development, are open and collaborative, and never stop striving to improve both themselves and the products and services they are responsible for. You'll partner with our Development team to create groundbreaking technology. This is an incredible opportunity to make a meaningful impact on the future of digital banking.
This is an exciting opportunity to expand your skill set and achieve job satisfaction and work-life balance.
Responsibilities:
Participate in SRE software engineering, writing code that continually reduces human intervention in operational tasks and automates processes.
Lead in-depth technical and data analysis to gauge service trends and drive improvements.
Contribute to the prioritization of reliability features and to the design, development, and delivery of effective tooling, alerts, and automated responses that identify and address reliability risks.
Contribute to proactive technical communication of reliability, stability, and efficiency results (based on Service Level Objectives), service health (via dashboards), key reliability risks, and issues to senior business and technology stakeholders, in order to prioritize activity (based on trend analysis) and direct investment and action.
Design and develop infrastructure as code (IaC) with Terraform/Pulumi.
Ensure all infrastructure code is thoroughly tested in a CI pipeline.
Ensure all infrastructure components are highly visible (monitoring, logging, alerting).
Manage cloud environments in accordance with company security guidelines.
Troubleshoot incidents, identify root causes, fix and document problems, and implement preventive measures.
Ensure application performance, uptime, and scale while maintaining high standards of code quality and thoughtful design.
Document best practices regarding application deployment and infrastructure maintenance.
Work in tandem with our engineering team to identify and implement optimal solutions for their day-to-day tasks.
Provide guidance and mentorship to development teams to build cloud competencies.
Provide 24/7 operational support for the digital platform.
Optimize Kubernetes to maintain system uptime.
Debug production issues and execute runbooks to mitigate potential production issues.
Help guide our overall strategy through design, prototyping, and market research.
Communicate with senior management and manage vendors.
Qualifications:
Degree in Computer Science, Engineering, or equivalent experience.
4+ years of experience in software development and/or SRE functions, with at least 2 years in a senior capacity.
You are either a Software Engineer with a real interest in systems, networking, monitoring, and automation; or an experienced sysadmin or systems engineer with professional skills in Linux, preferably on distributed systems at scale, and a demonstrable interest and experience in using software engineering to solve operational problems.
Comfortable writing software to automate API-driven tasks at scale. Cloud Tooling engineers primarily use Node.js and/or Java; Kotlin and Go are also key languages in our environment.
Experience automating the build and deployment of software products, and understanding the related challenges in distributed systems.
Very good knowledge of and experience with architecting and provisioning cloud-based infrastructure on Google Cloud Platform (GCP), Amazon Web Services (AWS), or Microsoft Azure.
Excellent communication (both verbal and written). The ability to communicate confidently and clearly on conference calls, in meetings, via email, etc. at all levels of the organization is essential.
Ability to quickly and clearly communicate incident status via email in business-friendly language.
Experience with and advanced understanding of observability, CI/CD, and release management.
Well-rounded knowledge of OS platforms (Linux/UNIX), networking, web systems, and DevOps.
Experience working with large-scale distributed systems with an understanding of microservices architecture concepts.
Strong organizational skills and the ability to effectively manage multiple tasks simultaneously.
Capable of working in a complex, fast-paced environment and able to remain calm during stressful situations.
Benefits
Health insurance with 100% premium covered
Generous PTO / sick leave
Etc.
WHAT'S ON OFFER
You will be remunerated with an excellent base salary and entitled to attractive company benefits. Additionally, you will enjoy a fun and collaborative work environment, along with strong career progression. To submit your application, please apply online or email your updated CV in Microsoft Word format to .
Your interest will be treated with strict confidentiality.
TechBridge Market
Other Info
Makati City, Metro Manila
Permanent
Full-time