Senior Software Engineer
Company: Microsoft Corporation
Location: Baltimore
Posted on: February 15, 2025
Job Description:
Want to impact the foundation for future AI storage development
in Azure, the world's computer? The Azure Managed Lustre File
System (AMLFS) team leads development, deployment, and monitoring
of the most popular High-Performance Computing (HPC) parallel file
system in the world: Lustre. The AMLFS Platform Team is responsible
for end-to-end delivery of AMLFS images, cluster deployment, logs
and metrics, and configuration compliance.
As a Senior Software
Engineer on the AMLFS Platform Team, you'll be responsible for
developing the reliable deployment of AMLFS in Azure, assessing and
mitigating security risks, developing comprehensive unit and
system-level tests, and diagnosing, mitigating, and fixing the most
challenging deployment and upgrade customer issues. You'll design
and develop logging, monitoring, and reporting capabilities for
AMLFS and help define and measure key Service Level Indicators
designed to make our product increasingly robust. This opportunity
will allow you to develop expertise in distributed system design,
grow proficient in navigating and managing Linux operating systems,
and collaborate with the core storage, compute, and networking
teams that form the foundation of Azure.
Responsibilities:
- Collaborates with appropriate stakeholders to determine user
requirements for a scenario.
- Drives identification of dependencies and the development of
design documents for a product, application, service, or
platform.
- Creates, implements, optimizes, debugs, refactors, and reuses
code to establish and improve performance and maintainability,
effectiveness, and return on investment (ROI).
- Leverages subject-matter expertise of product features and
partners with appropriate stakeholders (e.g., project managers) to
drive a workgroup's project plans, release plans, and work
items.
- Acts as a Designated Responsible Individual (DRI) and guides
other engineers by developing and following the playbook, working
on call to monitor system/product/service for degradation,
downtime, or interruptions, alerting stakeholders about status, and
initiating actions to restore system/product/service for simple and
complex problems when appropriate.
- Proactively seeks new knowledge and adapts to new trends,
technical solutions, and patterns that will improve the
availability, reliability, efficiency, observability, and
performance of products while also driving consistency in
monitoring and operations at scale.
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical
field AND 4+ years technical engineering experience with coding in
languages including, but not limited to, C, C++, C#, Java,
JavaScript, or Python
- OR equivalent experience.
- 2+ years working, developing, and debugging within a Linux
operating system environment and at least a broad understanding of
Linux fundamentals.
- 2+ years of experience with high-performance computing OR
distributed systems in an industry or academic setting.
Other Requirements:
- Ability to meet Microsoft, customer, and/or government security
screening requirements is required for this role. These
requirements include, but are not limited to, the following
specialized security screenings:
- Microsoft Cloud Background Check: This position will be
required to pass the Microsoft Cloud Background Check upon
hire/transfer and every two years thereafter.
Preferred Qualifications:
- Bachelor's Degree in Computer Science or related technical
field AND 8+ years technical engineering experience with coding in
languages including, but not limited to, C, C++, C#, Java,
JavaScript, or Python
- OR Master's Degree in Computer Science or related technical
field AND 6+ years technical engineering experience with coding in
languages including, but not limited to, C, C++, C#, Java,
JavaScript, or Python
- OR equivalent experience
- 4+ years of experience with high-performance computing OR
distributed systems in an industry or academic setting.
- Experience with the Lustre parallel file system OR an
equivalent parallel or distributed file system.
- 4+ years of working, developing, and debugging within a Linux
operating system environment.
Keywords: Microsoft Corporation, Baltimore, Senior Software Engineer, IT / Software / Systems, Baltimore, Maryland