HPC Architect job vacancy in Holliston, Massachusetts Atlantic Partners

Vacancy expired!

Job Description: HPC Architect. This is offered as a full-time or contract position.

The HPC Cloud Architect will administer high performance scientific computing platforms and infrastructure, and support research projects.

Responsibilities:

Use your experience across multiple disciplines, including high performance computing (HPC), cloud, architecture, design, network, security and systems, to implement and provide advanced system engineering services to customers.
Manage, administer and support daily operation of computing systems both onsite and in the cloud.
Design, implement and maintain scalable High-Availability (HA) and Fault-Tolerant (FT) computing systems.
Follow cloud computing best practices by using Amazon Virtual Private Cloud (VPC), Amazon Elastic Compute Cloud (EC2) and other advanced cloud features.
Investigate and provide technical options to managers and researchers for selecting effective computing solutions based on requirements.
Architect a framework that is more readily available and easy to use. When evaluating a new architecture, make build-versus-buy decisions and consider cost.
Work in coordination with other internal teams to ensure the infrastructure fully and effectively supports current and planned application systems.
Troubleshoot OS, Networking, Storage, and Software issues while leveraging internal teams for solutions.
Deliver changes to the HPC production platforms according to the change control process, communicating with and seeking approvals from business owners.
Practice network asset management, including maintenance of network component inventory and related documentation.
Develop tools to deploy, manage, monitor, and troubleshoot HPC systems at scale.
Maintain asset lists of all servers, applications and licensing to ensure compliance.
Maintain security standards according to internal policies.
Execute the day-to-day activities of the Incident Management process.
Manage and respond to tickets/requests in accordance with SLA timeframes.

Required Experience:

8-10 years of hands-on systems administration/engineering experience with Linux.
Experience with high performance computing systems in Life Sciences, Engineering, Manufacturing or Financial Services will be an added advantage.
Minimum three years with Amazon Web Services (AWS) cloud computing.
Extensive administration experience in GPU-based platforms.
Excellent written and oral communication skills and ability to work with people at every level.

Required Skills:

Demonstrated experience in optimizing computing performance and measurement.
Comprehensive knowledge of security compliance and security control.
Proficiency in shell scripting, Ruby, Perl or Python.
Excellent organization and time management skills and ability to identify priorities to accomplish a variety of tasks simultaneously.
Comprehensive knowledge of the Configuration Management (CM) process and software development tools such as Git, GitLab, Nexus, Jenkins, Maven or JIRA.
Working knowledge of HPC schedulers, distributed/parallel file systems, underlying IT systems and the HPC development process, including high-throughput and tightly coupled approaches.
Knowledge of statistics, numerical modeling, data analysis and machine learning.
AWS certification at Professional level.
Experience with cloud CLI and SDK.
An understanding of the cloud computing delivery model as it relates to HPC
Knowledge of the underlying infrastructure requirements such as Networking, Storage, and Hardware Optimization.
Experience in a customer-facing, sales-aligned role such as consultant, solutions engineer or solutions architect
Track record of implementing AWS services in a variety of business environments such as large enterprises and start-ups.
AWS Certification, e.g., AWS Solutions Architect Associate.
Understanding of application, server, and network security
Experience with DevOps tools such as Ansible Tower, Bitbucket, Terraform and CloudFormation.
Experience in Linux administration across various distributions such as Red Hat, Amazon Linux and CentOS.
Experience with job schedulers like Grid Engine, LSF, PBS, SLURM, Torque, Symphony, TIBCO.
Experience with compilers and libraries such as MPI, GCC and CUDA.
Experience with scripting (bash, Python, PowerShell, etc.).
Experience with file systems such as NFS, Lustre and GPFS.
Experience in application installation and troubleshooting on CPU- and GPU-based HPC clusters.

Certifications (Desirable):

AWS Administrator Professional or higher
Linux Administration

Optional Skills:

Experience with Docker, Singularity, Kubernetes or Google Cloud Platform will be a plus.
Knowledge of distributed computing
Ansible, Jira, Confluence, ServiceNow, Excel, presentation skills
Experience building clusters from individual machines (not a managed service such as EMR)

ID	#15167087
State	Massachusetts
City	Holliston
Job type	Permanent
Salary	USD Depends on Experience
Source	Atlantic Partners
Showed	2021-06-06
Date	2021-05-21
Deadline	2021-07-20
Category	Et cetera