Collaborative AI research lab · est. 2017

AI research by brilliant people around the world.

Alphabell is a collaborative multinational research lab. We bet that the next AI breakthroughs will come from sharper, more elegant ideas — not from ingesting more of the internet or building another data center. So we coordinate hundreds of independent researchers, data scientists, and hackers who actually think about this work for a living.

614 active members across 41 countries
87 public releases (papers, datasets, tools) · 2025–26
32 replications of published AI research shipped
42 active projects, member-led, open by default

The bet

Many minds, not more compute.

The dominant paradigm in AI today is straightforward: more data, more parameters, more GPUs. It has carried the field a long way. It is also hiding a quieter truth — that "we don't yet have a good idea" keeps getting mistaken for "we don't yet have enough compute".

We think the next round of real breakthroughs will come from people with sharper hypotheses, careful theory, and the kind of focused thinking that doesn't get easier just because you spent another billion dollars. The bottleneck is the idea, not the cluster.

So we don't try to build a frontier lab. We coordinate as many great thinkers as we can find — independent researchers, data scientists, and hackers across dozens of countries — instead of ingesting more of the internet or pouring billions into another data center.

The next AI breakthroughs will come from sharper ideas, not bigger clusters. Many minds beat more compute.

The work we like best is the kind that doesn't get cheaper at scale: a clean hypothesis, a careful replication, an interpretability result that finally explains something, a benchmark that exposes a real failure mode. None of these get unblocked by 10× the GPU budget. They get unblocked when you find the right person and give them runway to think.

That is the entire business model. Find the people. Give them runway. Publish what they find.

Research agenda

The kinds of problems where ideas matter more than scale

We pick problems where the bottleneck is the hypothesis, not the cluster: open questions where the cost of being wrong is low, the cost of being right is high, and the work benefits from being done in public. The areas below receive most of our current funding.

Architectures & training

New model architectures, training methods, and scaling experiments at the edge of what small teams can run.

Mechanistic interpretability

Sparse-feature atlases, causal scrubbing, refusal direction studies. Open dictionaries for open models.

Evaluation methods

Held-out benchmarks, agent-trajectory evals, multilingual prompt sets, blind-spot detection for VLMs.

Agent systems

Long-horizon agent training, schema-aware tool use, replay buffers, and harnesses that don't fall apart on day five.

Dataset experiments

New training data, new ways to study existing data, and underserved-language eval sets built with native speakers.

Reproducibility & audits

End-to-end replications of published work, bug bounties on shipping code, and shared reference numbers.

Recent work

Shipped in the open

A sample of what members have released in the last few months — datasets, audits, atlases, tools. Every release is reproducible by design: code, seeds, configs, and the held-out splits all live next to the paper.


Open science as default

Reproducible by design

We treat reproducibility as a design constraint, not a virtue badge to be added at the end. Every project releases enough material for an independent team to verify the result, and we fund the people who try.

Open by default

Code, weights, datasets, and write-ups are all released under permissive licenses. Closed-source output is the exception, and we want a reason on the record.

Seeds & configs shipped

No "and we did some hyperparameter tuning". Every release ships the seed, the config, the data split, and the version of the eval harness used to score it.
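As a rough illustration of what "shipping the seed, the config, the data split, and the harness version" could look like in practice, here is a minimal Python sketch. The `ReleaseManifest` class, its field names, and the helper functions are hypothetical and invented for this example, not a format Alphabell has published. All values are placeholders.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ReleaseManifest:
    """Hypothetical record of what a release ships next to its result."""
    seed: int
    config: dict
    split_sha256: str          # fingerprint of the held-out split's example IDs
    eval_harness_version: str


def hash_split(example_ids):
    """Hash the sorted example IDs so the fingerprint ignores ordering."""
    digest = hashlib.sha256()
    for example_id in sorted(example_ids):
        digest.update(example_id.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def missing_fields(manifest):
    """Return the names of manifest fields that are empty or unset."""
    return [
        name
        for name, value in asdict(manifest).items()
        if value in (None, "", {})
    ]


# Placeholder values, made up for illustration only.
manifest = ReleaseManifest(
    seed=1234,
    config={"learning_rate": 3e-4, "batch_size": 32},
    split_sha256=hash_split(["ex-002", "ex-001", "ex-003"]),
    eval_harness_version="0.0.0-example",
)

print(json.dumps(asdict(manifest), indent=2, sort_keys=True))
print("missing:", missing_fields(manifest))
```

Hashing the sorted example IDs rather than the file bytes is one possible design choice: it lets a replicator confirm they scored the same examples even if the split was re-serialized.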

Replications get paid

We pay members to replicate published work — ours and others'. Negative results are published the same way positive ones are. Outcomes go in the public log either way.
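One way to picture a public log that records outcomes "either way" is a record that stores the reported and reproduced numbers side by side and is written out whether or not they agree. The sketch below is an assumption-laden illustration: the `ReplicationOutcome` class, the absolute-tolerance rule, and the paper names are all hypothetical, and the numbers are invented placeholders.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ReplicationOutcome:
    """Hypothetical log entry for one replication attempt."""
    paper: str
    metric: str
    reported: float
    reproduced: float
    tolerance: float  # absolute tolerance, chosen by the replicator (assumption)

    @property
    def replicated(self) -> bool:
        """True when the reproduced number falls within tolerance of the reported one."""
        return abs(self.reported - self.reproduced) <= self.tolerance


def to_log_line(outcome):
    """Serialize an outcome as one JSON line, positive or negative."""
    record = asdict(outcome)
    record["replicated"] = outcome.replicated
    return json.dumps(record, sort_keys=True)


# Invented placeholder outcomes: one that matches and one that does not.
outcomes = [
    ReplicationOutcome("example-paper-a", "accuracy", 0.90, 0.895, 0.01),
    ReplicationOutcome("example-paper-b", "accuracy", 0.80, 0.70, 0.01),
]

# Both outcomes are logged; nothing is filtered on the result.
log_lines = [to_log_line(outcome) for outcome in outcomes]
for line in log_lines:
    print(line)
```

The point of the sketch is that the logging path has no branch on `replicated`: a failed replication produces a log line of exactly the same shape as a successful one.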

Working on something worth doing in the open?

If it's smaller than a paper but bigger than a tweet, we probably want to host it. Tell us what you're thinking — the form is three short questions and we read everything.

Pitch a project