ML Infrastructure Engineer (Staff/Senior)
Company: Abridge AI Inc.
Location: San Francisco
Posted on: November 6, 2024
Job Description:
Abridge was founded in 2018 with the mission of powering deeper
understanding in healthcare. Our AI-powered platform was
purpose-built for medical conversations, improving clinical
documentation efficiencies while enabling clinicians to focus on
what matters most: their patients.

Our enterprise-grade technology
transforms patient-clinician conversations into structured clinical
notes in real time, with deep EMR integrations. Powered by Linked
Evidence and our purpose-built, auditable AI, we are the only
company that maps AI-generated summaries to ground truth, helping
providers quickly trust and verify the output. As pioneers in
generative AI for healthcare, we are setting the industry standards
for the responsible deployment of AI across health systems.

We are a
growing team of practicing MDs, AI scientists, PhDs, creatives,
technologists, and engineers working together to empower people and
make care make more sense.

The Role
As an ML Infrastructure Engineer
at Abridge, you will be responsible for scaling and deploying
machine learning models to handle increasing traffic demands and
integrating them with various platforms. You'll play a pivotal role
in building a scalable infrastructure that not only supports
current deployments but also lays the foundation for long-term
growth. Your role will be critical in ensuring our AI-driven
healthcare platform is powered by robust, scalable, and efficiently
deployed models.

What You'll Do
- Architect, design, and implement ML software systems for
deploying and managing models at scale.
- Stand up ML models for inference, starting with critical models
like the 'linkages' model, and ensure they are capable of handling
traffic increases.
- Develop and maintain infrastructure that supports efficient ML
operations, including model evaluations, deployments, and training
at scale.
- Collaborate closely with ML researchers, engineers, and
cross-functional teams to ensure seamless integration of models
with services like Zoom and Athena.
- Work with stakeholders across machine learning and operations
teams to iterate on systems design and implementation.
- Optimize and maintain the performance of ML systems to ensure
high availability, fault tolerance, and smooth scalability.
- Troubleshoot production issues and continuously improve systems
to enhance performance and efficiency.

What You'll Bring
- 5+ years of experience in ML model deployment and scaling, with
a focus on production-quality software.
- Strong proficiency in Python and Kubernetes, with experience
building scalable ML infrastructure.
- Expertise in designing fault-tolerant, highly available
systems.
- Experience working with cloud environments, Infrastructure as
Code (IaC), and managing deployments using Kubernetes.
- Proficiency in optimizing system performance, debugging
production issues, and designing systems for scalability and
security.
- Experience in software design and architecture for highly
available machine learning systems for use cases like inference,
evaluation, and experimentation.
- Excellent understanding of low-level operating systems
concepts, including multi-threading, memory management, networking
and storage, performance, and scale.
- Bachelor's/Master's Degree or greater in Computer
Science/Engineering, Statistics, Mathematics, or equivalent.
- Excellent interpersonal and written communication
skills.

Ideally, You Have
- Experience with large-scale ML platforms like Ray, Databricks,
or Anyscale.
- Expertise with ML toolchains such as PyTorch or
TensorFlow.
- Proven experience working with distributed systems and handling
inference at scale.
- Background in working with teams and leaders to deliver
impactful ML-powered solutions in fast-paced environments.
- Demonstrated experience incubating and productionizing new
technology, working closely with research scientists and technical
teams from idea generation through implementation.

Why Work at Abridge?
- Be a part of a trailblazing, mission-driven organization that
is powering deeper understanding in healthcare through AI!
- Opportunity to work and grow with talented individuals and have
ownership and impact at a high-growth startup.
- Flexible/Unlimited PTO - Salaried team members can take as
much approved time off as they need, plus 13 paid holidays.
- Equity - For all salaried team members.
- Medical insurance - We pay 100% of the premium for you + 75%
for dependents. 3 Aetna plans to choose from.
- Dental & Vision insurance - We pay 100% of the premium for you
+ 75% for dependents. 2 Aetna plans to choose from.
- Flexible Spending (FSA) & Health Savings (HSA) Accounts.
- Learning and Development budget - $3,000 per year for coaching,
courses, workshops, conferences, etc.
- 401k Plan - Contribute pre-tax dollars toward retirement
savings.
- Paid Parental Leave - 16 weeks paid parental leave for all
full-time employees.
- Flexible working hours - We care more about what you accomplish
than what specific hours you're working.
- Home Office Budget - We provide up to $1,600 in a one-time
reimbursement to set up your home office.
- Sabbatical Leave - 30 days of paid Sabbatical Leave after 5
years of employment.
- ...Plus much more!

Diversity & Inclusion
Abridge is an equal
opportunity employer. Diversity and inclusion are at the core of
what we do. We actively welcome applicants from all backgrounds
(including but not limited to race, gender, educational background,
and sexual orientation).