Responsibilities:

  • Understand, automate, and scale Helpshift cloud infrastructure. This involves learning and working with various cloud technologies, scripting languages, and configuration management (CM) tools.
  • Own Helpshift production services: ensure complete monitoring coverage, and troubleshoot and fix production issues.
  • Architect and implement projects that reduce or eliminate repetitive and administrative tasks.
  • Ensure all services and infrastructure are highly available, with disaster recovery in place.
  • Performance engineering for backend services and stores such as MongoDB, Elasticsearch, Kafka, HAProxy, etc.
  • Work in a lean team, with a focus on getting things done.
Requirements:

  • In-depth knowledge of running/managing UNIX-like operating systems (we use Ubuntu).
  • Good programming skills with a focus on scripting (Python, Shell, Perl).
  • Good fundamental knowledge of networking (TCP/IP, firewalls, routing).
  • Knowledge of various FOSS tools for monitoring, graphing, capacity planning, logging, etc.
  • Some experience with automation tools like Puppet, Fabric, etc.
  • Some experience with cloud computing platforms like Google App Engine, Amazon Web Services (AWS), Heroku, etc.
  • Have an automation mindset and ability to reason and work with complex systems.
  • Ability to understand and work with databases (such as PostgreSQL), queuing systems (such as RabbitMQ, Kafka), and Hadoop is a plus.
  • 4+ years of relevant experience.