Announcement
Feb 18, 2026
Today, we are announcing Proximal. We are excited about a world in which coding agents can run autonomously for multiple weeks, solve the hardest technical problems, and discover novel ideas that advance progress across science and engineering. We believe this future is not far off, and that the biggest bottleneck to reaching it is training data.
Many companies work on training data, but almost all of them approach it the wrong way. Historical capability breakthroughs were the result of creative engineers discovering scalable data collection methods in specific domains, not of thousands of contractors manually writing task demonstrations and graders. Inevitably, the impact of human-written data will shrink as model capabilities increase: agents already outperform most humans in many domains, and the pool of experts capable of judging model outputs shrinks with every new model release.
To name a few examples: LLMs are making significant progress in GPU kernel engineering because KernelBench showed that kernel optimization problems can be generated scalably from scratch. We are seeing breakthroughs in AI for theorem proving because Lean provides a playground where mathematical proofs can be formally verified without human grading. CodeI/O and Synthetic-1 showed that we can improve models' ability to reason about code through synthetic output-prediction tasks.
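To make the last idea concrete, here is a rough sketch of an output-prediction task generator. The task schema, the `make_output_prediction_task` helper, and the seed program are illustrative assumptions, not the actual CodeI/O pipeline; the key property is that ground truth comes from executing the code, so no human grading is needed.

```python
import random

def collatz_steps(n):
    """Seed program for the task generator (any deterministic function works)."""
    n = abs(n) or 1  # map 0 and negatives into the positive domain
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

def make_output_prediction_task(func, source, n_inputs=3, seed=0):
    """Turn runnable code into an output-prediction task with executable ground truth."""
    rng = random.Random(seed)
    inputs = [rng.randint(-100, 100) for _ in range(n_inputs)]
    return {
        "prompt": f"Predict the output of this code for each input:\n{source}",
        "inputs": inputs,
        "targets": [func(x) for x in inputs],  # verified by execution, not by humans
    }

task = make_output_prediction_task(collatz_steps, "def collatz_steps(n): ...")
```

Because the targets are computed rather than annotated, the same generator scales to arbitrarily many seed programs.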
These are just a few examples of great ideas - many more were discovered throughout the last few years, and significantly more are yet to be discovered. We believe that the process of discovering and executing on these ideas can be drastically accelerated by a data engine that shares infrastructure, abstractions, and learnings across domains, rather than reinventing the stack from scratch in dozens of isolated efforts.
Proximal is a new data company: Our core belief is that data which is complex enough to teach today’s frontier models is not bottlenecked by domain experts, but by great ideas and excellent software. We are not a recruiting firm or a talent marketplace, but a research and engineering organization that treats data as a problem which deserves the same level of rigor as work on training algorithms and model architectures. We think that this is the most impactful work towards agents that can autonomously solve complex technical problems, and intend to share our research and progress in the open.
Our Work
There is a lot of work to be done for us to reach our goal. Here are some of the things we’re thinking about right now:
Engineering: Producing data at scale poses a lot of hard engineering challenges. Some concrete ones we are dealing with:
Automation: With every new model generation, there is more to automate and systematize. We experiment heavily with each new model release to automate more of our work across areas including synthetic code generation, rubric generation, agentic QA, and more.
Scaling Agent Infrastructure: To both build and evaluate our data, we run a huge number of agents in parallel. We have many infrastructure challenges to solve in scaling agent runs both horizontally (millions of runs in parallel) and vertically (12+ hour agent runs).
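The horizontal part of that scaling problem has a familiar core shape. A minimal sketch, where `run_agent` is a stand-in for a real sandboxed rollout and the concurrency bound is an assumption, not our production configuration:

```python
import asyncio

async def run_agent(task_id: str) -> str:
    # Stand-in for a real agent rollout (sandbox setup, tool use, model calls).
    await asyncio.sleep(0)
    return f"result-{task_id}"

async def run_many(task_ids, max_concurrency=1000):
    # Bound concurrency with a semaphore so a queue of millions of runs
    # does not overwhelm sandbox schedulers or API rate limits.
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(tid):
        async with sem:
            return await run_agent(tid)

    # gather preserves input order, so results line up with task_ids.
    return await asyncio.gather(*(guarded(t) for t in task_ids))

results = asyncio.run(run_many([f"t{i}" for i in range(5)]))
```

The vertical axis (12+ hour runs) is the harder part in practice: it requires checkpointing and resuming agent state, which a sketch like this does not capture.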
Synthetic Data Generation: We use synthetic data in many parts of our data generation and want to scale our methods to generate entire codebases from scratch. This requires us to essentially simulate months of collaborative software development in an automated pipeline, orchestrating multi-agent systems that run in parallel for 30+ days.
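A toy skeleton of such a pipeline, to show the shape rather than the substance: the planner/engineer split, the ticket schema, and the stub agents are illustrative assumptions, with each stub standing in for a long-running LLM agent.

```python
def plan_tickets(spec):
    # Planner agent stub: in practice, an LLM decomposes the spec into tickets.
    return [{"id": i, "title": f"{spec}: part {i}"} for i in range(3)]

def draft_patch(ticket):
    # Engineer agent stub: in practice, a coding agent writes and tests the patch.
    return {"ticket": ticket["id"], "diff": f"+ implement {ticket['title']}"}

def simulate_project(spec, rounds=2):
    """Simulate several 'days' of collaborative development on one codebase."""
    history = []
    for day in range(rounds):
        for ticket in plan_tickets(spec):
            history.append((day, draft_patch(ticket)))  # patches merged sequentially
    return history

history = simulate_project("auth service")
```

The interesting engineering lives in what the stubs hide: keeping dozens of concurrent agents consistent with one evolving repository over weeks of simulated time.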
Processing Large Amounts of Data: More than 100M public PRs are created and merged every year. This is a large, ongoing stream of data that is growing rapidly.
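Back-of-envelope, 100M PRs per year is roughly three merged PRs every second, around the clock. A sketch of that rate plus a simple filter over a hypothetical PR-event schema (the field names and thresholds are illustrative assumptions):

```python
# Average ingestion rate implied by the stream.
PRS_PER_YEAR = 100_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600
avg_rate = PRS_PER_YEAR / SECONDS_PER_YEAR  # roughly 3 PRs per second

def useful_for_training(pr):
    # Hypothetical event schema; real pipelines consume bulk archives of
    # public PR events and apply far richer quality filters than this.
    return pr["merged"] and pr["files_changed"] <= 50 and pr["has_tests"]

events = [
    {"merged": True, "files_changed": 3, "has_tests": True},
    {"merged": False, "files_changed": 3, "has_tests": True},
    {"merged": True, "files_changed": 400, "has_tests": False},
]
kept = [pr for pr in events if useful_for_training(pr)]
```

At that rate, even cheap per-PR filters must run as a streaming pipeline rather than ad-hoc batch jobs.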
Open data research: As an independent company, we are incentivized to share our research in the open and help the public understand the capabilities of frontier agents from different vendors. Some concrete questions we want to tackle:
Training for Ultra-Long Horizon Tasks: As we experiment with extremely difficult coding tasks, we need to figure out how to train agents on tasks that would usually take days or weeks to solve. We are particularly interested in multi-agent systems and in jointly training master and subagents.
Reward Hacking: The better coding agents become, the more sophisticated and creative their reward hacking will become. We are interested in methods that detect unwanted behavior during training, and even more so in methods that stress-test environments to uncover potential reward hacks before a training run starts.
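One of the simplest detection heuristics is a static check on which files a run touched: a classic reward hack is editing the tests instead of the code under test. The protected path prefixes below are illustrative assumptions; real detection also inspects diffs and runtime behavior.

```python
def suspicious_changes(changed_files, protected=("tests/", ".ci/", "verifier/")):
    """Flag files the verifier depends on that an agent run modified.

    A run that rewrites its own grader can pass every check while solving
    nothing, so these paths should be reviewed or locked down entirely.
    """
    # str.startswith accepts a tuple of prefixes, so one pass suffices.
    return sorted(f for f in changed_files if f.startswith(protected))

flagged = suspicious_changes(["src/app.py", "tests/test_app.py"])
```

Checks like this are necessary but not sufficient: the more interesting hacks exploit the environment's semantics rather than its file layout, which is why we care about stress-testing environments before training.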
Fuzzy Verifiers: While deterministic tests can measure whether AI-generated code is functionally correct, they cannot assign rewards for aspects such as code quality and long-term maintainability. The holy grail of RL is a fuzzy verification agent that can assign rewards for any coding task, eliminating manual work in verifier design.
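One way such a verifier could be structured, as a rough sketch: gate on hard correctness, then blend in soft rubric scores. The half/half split, the rubric schema, and the lambda scorers are assumptions for illustration; in practice each `score_fn` would be an LLM judge rather than a string check.

```python
def fuzzy_reward(diff, tests_passed, rubric):
    """Blend a hard pass/fail signal with soft quality scores.

    `rubric` maps a criterion name to (weight, score_fn), where score_fn
    returns a value in [0, 1]. The 50/50 split between correctness and
    quality is an illustrative choice, not a recommendation.
    """
    if not tests_passed:
        return 0.0  # functional correctness gates everything
    total_weight = sum(w for w, _ in rubric.values())
    soft = sum(w * fn(diff) for w, fn in rubric.values()) / total_weight
    return 0.5 + 0.5 * soft

rubric = {
    "short_diff": (1.0, lambda d: 1.0 if len(d) < 500 else 0.5),
    "no_todos": (1.0, lambda d: 0.0 if "TODO" in d else 1.0),
}
reward = fuzzy_reward("+ fix off-by-one", tests_passed=True, rubric=rubric)
```

The hard part is not the blending arithmetic but making the soft scorers robust: a fuzzy judge is itself a reward surface that agents will learn to hack.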
Join Us
We are a team of engineers and researchers from companies like Cursor, Prime Intellect, and Jane Street. Our team members have published papers at leading research conferences, built highly popular open-source software, and founded and sold their own companies.
We want to work with engineers and researchers who are excited about pushing the frontier of AI capabilities. If you would like to talk, please reach out to hiring@proximal.ai.