DevOps/SRE
Shelf
About Shelf
There is no AI Strategy without a Data Strategy. Getting GenAI to work is mission-critical for most companies, but 90% of AI projects have not been deployed. Why? Poor data quality: it’s the #1 obstacle companies face in getting GenAI into production.
Shelf unlocks AI readiness. We provide the core infrastructure that enables GenAI to be deployed at scale. We help companies deliver more accurate GenAI answers by eliminating bad data in documents and files before they go into an LLM and create bad answers.
We’re partnered with Microsoft, Salesforce, Snowflake, Databricks, OpenAI and other leaders bringing GenAI to the enterprise. Our mission is to empower humanity with better answers everywhere.
About the role
The Platform Engineering team works across the stack to give product teams paved, secure, cost-efficient paths to build, ship, and run software with minimal cognitive load. We own the "how," so product teams can focus on the "what." You will join the team responsible for running the core infrastructure that supports Shelf products.
This role is primarily based in our European offices in Wroclaw, Poland and Lviv, Ukraine. We will prioritize candidates who are already in Wroclaw or are open to relocating there, as we believe in the value of in-person collaboration to foster strong relationships and seamless communication within our team.
In specific situations, we will also consider remote candidates based in one of the countries listed in this job posting. In any case, we ask all new hires to visit our office for the first week of their onboarding (accommodation and travel covered), and after that for at least two days per month or one week every two months.
You will develop reusable components, improve system performance, and create scalable abstractions that accelerate product development across the organization.
You will maintain high standards for reliability and security in your work and in the systems used by other teams.
This is a high-ownership, hands-on engineering role. You will manage everything from Terraform/OpenTofu modules and CI/CD pipelines to SSO permissions and observability tools, with a mandate to build infrastructure that works and keeps working.
You will work with AWS, Datadog, OpenTofu, Snowflake, GitHub, Azure, various LLMs, and many other tools and services.
In this role, you will
- Write and maintain infrastructure as code in OpenTofu, making modules more reusable and robust so that more engineers can ship infrastructure safely on their own.
- Write clear runbooks and playbooks that explain how things work and what to do when they break. You present your work in a clean, structured way, prefer writing a good doc once to enable self-serve, and treat every question as a signal to either document the answer or automate it so it does not need to be asked again.
- Care deeply about the health of our infrastructure by keeping databases, LLMs, and third-party self-hosted services on current, supported versions, standardizing them across environments, and actively hunting down and removing outdated components instead of tolerating an aging tech stack.
- Participate in on-call rotations and incident response, and write clear postmortems with concrete action items. You enjoy turning every incident into an opportunity to improve, define and refine SLOs and error budgets, and then follow through on the work that prevents repeats, tightens detection, speeds up response, and makes recovery cleaner.
- Treat CI/CD pipelines as a critical product. Own and improve hundreds of pipelines by making them faster, more reliable, easier to roll back, and more standardized so they reduce manual toil and mental overhead for developers.
- Become a Datadog and observability expert, tuning logging, metrics, tracing, dashboards, and alerts to squeeze out as much useful signal as possible. Build simple defaults, automation, and clear docs so developers can self-serve, contribute to observability, and rely on a solid platform rather than on ad hoc help from you.
- Make thoughtful build-vs-buy decisions and work directly with vendors and cloud support (AWS, Azure, GCP, and others) to solve infrastructure problems, plan upgrades, and find cost savings, preferring to ask good questions and pull in expertise rather than silently struggle on your own.
- Implement and enforce SOC 2-aligned policies for infrastructure and deployments, including disaster recovery and business continuity, change management, and security policies, and ensure they are practical, documented, and followed in day-to-day work.
You might thrive in this role if you
- Take pride in building and operating scalable, reliable, secure systems and are not comfortable bypassing protocols or cutting corners.
- Take full ownership of your work, handle ambiguity and rapid change, and proactively remove obstacles to deliver results.
- Are comfortable diving into any part of the stack, from infrastructure and backend services to product frontends, when that is what it takes.
- Use Python to automate repetitive work and improve your own and others’ workflows.
- Read AWS re:Invent announcements and us-east-1 post-mortems for fun.
Sample projects
- Provision the entire product infrastructure and applications in a new cloud or region.
- Design and implement a live database migration from us-east-1 to us-east-2.
- Maintain a 100% score on the AWS CIS Benchmark in our environments.
- Centralize audit trail logs from AWS, GCP, and Azure into a single place.
- Write a clear runbook describing how you conducted a disaster recovery test of a system component.
- Change the SSO provider and reconfigure services to use the new provider.
- Optimize infrastructure costs by improving configuration, identifying abandoned resources, and applying reserved or committed compute purchases where appropriate.
What Shelf Offers
- B2B contract
- Company Stock Options
- Hardware: MacBook Pro
- Modern technical stack and the opportunity to develop open-source software
- Premier AI development environment: GitHub Copilot, Claude Code, OpenAI, TypingMind, v0, MCP Servers, plus credits to experiment with emerging AI tools
Why Shelf
- Leadership with deep knowledge management, AI, and enterprise SaaS expertise
- Customers love us for innovative capabilities, reliability, and measurable business impact
- $60M+ raised from top-tier investors including Tiger Global, Insight Partners, and Base10
- High-velocity growth, tripling year over year for three consecutive years
- 100+ employees across the U.S. and Europe with ambitious hiring plans
Our Values
Quality - We’re united by our focus on world‑class Quality. Quality in all things, starting with everything that leaves your desk. Everything you touch (every email, report, campaign, and piece of code) should be outstanding. Your work product should blow people away. People should look at what you’ve done and say, “Wow.” That’s the standard here. Remember that how you do anything is how you do everything. Focus on craftsmanship: your ability to make things better.
Momentum - For us, momentum means knowing that the things you’re responsible for are moving forward. When you look around and see something that’s stalled, get it moving again. We pride ourselves on “ball movement.” When your boss or team leaves you with something, they should return to see measurable progress. Small, continuous movement is our recipe for success. Constantly look for how to make the work around you move forward. We want you to initiate solutions, ideas, and progress. Don’t wait for it to come to you; reach out and create movement. All the time.
Accountability - We expect every team member to feel that they are accountable for more than anyone might normally expect. Each of us should feel real responsibility for things even at the edge of our control. We consistently share and align on expectations, give each other open and respectful feedback, and use those two drivers to ensure that every agreement we make with one another is clear and complete.
Hard Work - We’re here to do something difficult together. We care intensely about the mission and we expect that from our teammates. That care means that we work hard here. Hard work comes with long hours, extra effort…and real opportunity at Shelf. Your passion for creating and sustaining output is a part of our DNA. Support each other, cheer each other on, drive the mission forward. Great teams sustain intense effort together to win.
Learning Agility - We’re innovating in one of the fastest‑moving spaces in history at a time of accelerating global change. That’s incredibly exciting and requires each of us to commit fully to learning each and every day so that we can be the best at what we do. None of us know everything. All of us can learn anything. Staying open and constantly curious is a key success driver at Shelf. It also requires humility. We prize people who are consistently humble and open to making mistakes and growing from them. Recognize also that learning itself is a skill…we need you to be really good at it. Keep dialing in your own understanding about how you learn best and push yourself to keep growing.
Adapt and Thrive - Overcoming challenges lives deep in our DNA. We have a proud history of understanding and living the reality that obstacles are our opportunities…they’re the key to our success. Change is a constant in our business and fighting change is counterproductive. We need you to be good at being uncomfortable and understand that discomfort is the key to growth. Cultivate your own ability to adapt and know that struggling well is something you’ll share with every team you’re on at Shelf. Our company stories are about thriving through real difficulty…together.
Win Together - We win or lose as a team. Always. Everything you do here is connected to the rest of the organization. Part of our shared team environment demands full honesty…real candor and directness with one another. We expect you to constantly be thinking about how to support your teammates and the company, always acting in service to our shared mission and what’s best for the organization as a whole.