This is an engineering position in the Inventory Data & Enterprise Application Services (IDEAS) Observability group. The role plays a key part in architecting, planning, and developing solutions for critical Observability initiatives across the enterprise, spanning traditional compute environments, on-prem, and public cloud. The focus is on systems and application monitoring (logs, metrics, events), covering existing and open source monitoring tooling as well as public cloud monitoring services.
Key responsibilities:
Create a one-stop platform for full-link diagnosis, indicator monitoring, intelligent alerting, diagnosis workflows, and self-healing
Develop metrics management standards and promote their application to business indicators and call links
Develop highly scalable, mission-critical observability platform(s)
Work extensively with time series data and critical application metrics, e.g. Java JMX, thread pools, GC
Develop visualizations in Grafana that provide single-pane views of end-user experience, application, infrastructure, and security
Collaborate with business teams to develop metrics that measure performance against initiatives, and report on them to stakeholders
Collaborate with the SRE and Application Integration teams to ensure that business, technical, and security requirements converge
Key skills and requirements:
Experience working with various agents and collectors (Fluent Bit, Prometheus exporters, OpenTelemetry, Splunk)
Hands-on experience designing and building an enterprise observability and/or AIOps platform
Knowledge of enterprise monitoring systems (such as AppDynamics, Dynatrace, Nagios, Sensu, Grafana, Prometheus)
Enterprise logging platforms (such as Elasticsearch, Splunk)
Automation scripting (such as Ansible, Terraform, PowerShell)
Strong knowledge of both the application and infrastructure domains, with the ability to develop automation scripts/utilities
Good technical knowledge of implementing, troubleshooting, and performance tuning hardware, operating systems, and system services
Excellent command of written and spoken English.
A good team player, able to work effectively at all levels of an organization and to influence others to move towards consensus
Solid Agile understanding and experience
Bachelor's degree in Computer Science/Information Technology or equivalent
In return, we offer:
Competitive salary & social benefits (e.g. private healthcare, Benefit System, life insurance)
Work in a friendly and diversified environment, appreciating differences in style and perspective and using them to add value to decisions leading to organizational success
A great environment for learning new technologies and tools, with online and instructor-led training opportunities
Working in a friendly, dynamic and multinational environment
Opportunity to influence how you perform your tasks: our teams are constantly looking for new and better ways of working, and we encourage all improvement ideas
A chance to make a difference with various affinity networks and charity initiatives
Job ID: 103128