Software Engineer II
Datadog
Oct 2020 - Present
• 1 yr 10 mos
Sr. Cloud Engineer
Xylem Inc.
Jul 2019 - Oct 2020
• 1 yr 4 mos
· Using Terraform to manage AWS resources such as RDS, Redis, EC2 and EKS.
· Using Terraform to upgrade and manage multiple EKS clusters across different accounts.
· Managing AWS EKS clusters with Terraform, including patching Kubernetes versions and upgrading Kubernetes components.
· Using Helm (v2 and v3) to deploy applications in Kubernetes with Bitbucket Pipelines.
· Writing Ansible roles, playbooks, and modules. Testing roles with Molecule to ensure roles are working properly.
· Converted Salt states, pillars, etc. to Ansible as part of a migration from Salt to Ansible.
· Managing repositories within JFrog Artifactory, which houses deployments from many different sources.
· Creating and managing Bitbucket Pipelines to automate the deployments for various deployment teams.
· Managing Jenkins pipelines to automate builds for legacy pipelines.
· Managing Datadog and Sumo Logic to create and respond to alerts.
· Responding to on-call alerts and incidents.
Sr. Operations Engineer
Size Stream
Feb 2019 - Jul 2019
• 6 mos
· Set up Datadog as an alerting tool and configured alerts with PagerDuty to notify the team of outages and application issues.
· Configured a 10 TB backup of SizeStream Body Scan data from a local NAS to AWS Glacier.
· Reconfigured the internal office network with 10Gb switches and teamed (20Gb) network cards to speed up data backups, which reduced backup times from 23 to 28 hours to 6 to 8 hours.
· Managed over 150 Lambda functions within AWS built and deployed by the application team.
· Configured GitLab pipelines to automate the build and deployment of Lambda functions (Python 3) using Serverless Framework, an automation platform for deploying resources into multiple cloud environments, and to build and deploy .NET and C++ applications.
· Managed existing deployments that used Chalice, a tool written by AWS to deploy Lambda functions.
· Wrote Ansible playbooks to automate the configuration of servers used as GitLab build environments. These servers ran Windows and Linux, since both .NET and C++ applications were being built.
· Managed JFrog Artifactory as a repository for Docker images and other applications.
· Managed backup of SQL databases within AWS RDS.
· Configured TeamViewer as a remote assistance tool for the SizeStream Body Scanners that customers use, which allows our technical support team to quickly resolve issues by logging into the machines remotely. Also wrote an Ansible playbook to automate its installation on new scanners and a PowerShell script to install it on existing scanners.
· Managed Technical Operations for the company and attended leadership meetings.
Software Engineer
Sauce Labs
Jan 2017 - Feb 2019
• 2 yrs 2 mos
· Managed pipelines for OS images such as Windows, iOS and Linux using Jenkins and Packer. These pipelines automated the creation of VM images to be used in QEMU in a private cloud.
· Created a proof-of-concept microservice to replace a manual configuration file with Docker/Kubernetes/Helm, and later assisted with the rewrite and deployment of this microservice.
· Automated the release of browser deployments for Windows. This solution helped deploy new browsers within 24 hours of release, down from the previous one week per browser deployment.
· Maintained existing and wrote new unit tests in Python, as well as Serverspec and Selenium tests in Ruby, to ensure code quality and performance.
· Wrote new features and fixed bugs in our Python code reported by customers as well as making performance improvements to our software.
· Maintained Ansible playbooks that deploy our code, manage our S3 assets, build/prep our VM disks, and distribute our QEMU images.
· Maintained a feature that uploaded our test assets such as recordings and screenshots to Amazon S3.
· Used Sumo Logic to debug and troubleshoot issues our customers may be having, or other issues that may affect our private cloud, microservices or filesystems.
· Used Ansible to deploy new OS images to thousands of QEMU hypervisors.
· Switched services that used long-running servers to Docker with Kubernetes clusters.
· Helped create a GKE environment with Docker containers that launched Selenium tests for our customers in headless mode.
· Maintained the pipeline that created and deployed our Docker container images for our GKE environment.
Sr. Systems Engineer
LexisNexis
Apr 2015 - Jan 2017
• 1 yr 10 mos
Sr. Systems/DevOps Engineer whose duties were helping design and engineer new environments within AWS, setting up infrastructure and monitoring solutions, automation (Python, Bash, PowerShell, C#, CloudFormation, Ansible, some Chef), managing AWS AMIs with Packer, managing Linux and Windows servers (Amazon Linux, which is CentOS 6.5, and Windows Server 2012 R2), working with other teams to configure and set up AWS Direct Connect, and developing proofs of concept for other tools and solutions.
- Manage Linux (Amazon Linux, CentOS 6.5) and Windows Servers (Windows Server 2012 R2) within an Amazon AWS environment.
- Working with multiple VPCs and updating them with AWS CloudFormation and Ansible, as well as launching new environments with a combination of CloudFormation, Ansible, and Python and PowerShell scripts.
- Using Python and Ansible to automate tasks such as cleaning up various unused resources in AWS in order to keep costs under control.
- Automating the release of tools like SymmetricDS and Jaspersoft using Docker containers to launch many test environments.
- Helping manage Splunk for multiple teams and troubleshooting Splunk connectivity (such as peering links) and firewall issues within Amazon AWS.
- Configuring and setting up Zabbix for both Windows and Linux servers. Also configuring PagerDuty for certain servers so the team is notified when issues occur and critical components stay online and working properly.
- Setting up AppDynamics agents within AWS Windows Web Servers (Server 2012 R2) so that the product and development teams can view performance of their applications, especially after a release to see if any new performance issues appear.
- Maintaining custom CIS images for Linux and Windows and distributing these images to multiple product teams to ensure that hardened images are used in AWS.