Superhuman Coding Agents

About

Proximal is a research lab for coding data

We are excited about a world in which coding agents can run autonomously for multiple weeks, solve the hardest technical problems, and discover novel ideas that drive progress across science and engineering. We believe that we are not far from this future, but that the biggest bottleneck preventing us from reaching it is training data.

Many companies work on training data, but almost all of them are approaching it the wrong way. Historical capability breakthroughs were the result of creative engineers discovering scalable data collection methods in specific domains, rather than thousands of contractors manually writing task demonstrations and graders. Inevitably, the potential impact of human data will diminish as model capabilities increase: agents already outperform most humans in many domains, and the number of experts capable of judging model outputs shrinks with every new model release.

Proximal is a new data company. Our core belief is that data complex enough to teach today’s frontier models is bottlenecked not by domain experts, but by great ideas and excellent software. We are not a recruiting firm or a talent marketplace, but a research and engineering organization that treats data as a problem deserving the same level of rigor as work on training algorithms and model architectures. We think that this is the most impactful work towards agents that can autonomously solve complex technical problems, and we intend to share our research and progress in the open.

San Francisco, CA

© 2024