Exclusive: New wallet app lets users store their own training data

Credit: Vana
Vana, a startup spun out of MIT, is rolling out an app that works like a wallet for personal data that can be used to train AI.
Why it matters: Vana hopes people will use the app to control and pool their own data with others, shape how it's used and share in the value it creates.
- More than a million people have contributed data across various pools to date, with typical individual payouts still small — roughly a few dollars, Vana tells Axios.
The big picture: AI systems are currently trained on hard-to-access data that largely sits with a handful of platforms like Meta, Spotify and Netflix.
- Big tech platforms are often required to allow users to download and export their data, but the process is confusing and slow, and the output is rarely in a format that can be used elsewhere.
- Vana acts as an "automation layer" that repeatedly pulls down your data and keeps it usable, in an effort to allow people — not platforms — to reap the value.
How it works: Users can link their Spotify listening history, LinkedIn behavior, Netflix viewing history, Instagram captions, and more to Vana's app, which stores data under the user's keys.
- Users can sign up with Google or through a crypto wallet, but the company says you don't need to understand tokens or crypto to participate.
- Individuals can pool similar data (e.g., music history, sleep metrics, ChatGPT logs) and collectively decide who can run computations on it.
- Vana says contributors are scored on authenticity, ownership, uniqueness and quality of their data, to determine their share when the pool licenses access for training or analytics.
- Rather than handing over raw records, the platforms will run privacy-preserving operations on pooled data, allowing use of the data without exposing individual records.
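By way of illustration: the score-based split described above could look something like the sketch below. Vana has not published its formula here, so everything in it is an assumption: the `Contribution` class, the 0-to-1 score range, the equal weighting of the four criteria and the proportional split in `split_revenue` are hypothetical stand-ins, not Vana's method.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """One contributor's data, scored on the four criteria Vana names.

    Each score is assumed (hypothetically) to fall between 0 and 1.
    """
    contributor: str
    authenticity: float
    ownership: float
    uniqueness: float
    quality: float

    def score(self) -> float:
        # Assumption: a plain average of the four criteria.
        return (self.authenticity + self.ownership
                + self.uniqueness + self.quality) / 4


def split_revenue(contributions, revenue):
    """Divide licensing revenue in proportion to each contributor's score."""
    total = sum(c.score() for c in contributions)
    if total == 0:
        # Nothing scored above zero, so nobody is paid.
        return {c.contributor: 0.0 for c in contributions}
    return {c.contributor: revenue * c.score() / total for c in contributions}


pool = [
    Contribution("alice", authenticity=1.0, ownership=1.0, uniqueness=0.5, quality=0.5),
    Contribution("bob", authenticity=1.0, ownership=1.0, uniqueness=0.0, quality=0.0),
]

# A hypothetical $10 licensing deal for the pool.
payouts = split_revenue(pool, revenue=10.0)
print(payouts)
```

In this made-up pool, the contributor whose data scores higher on uniqueness and quality takes the larger share of the deal, which is the incentive the scoring is meant to create.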
Reality check: It's one thing to trust your data to tech giants like Spotify, Netflix and LinkedIn. It's another to trust it with a tiny startup few people have heard of.
- Vana says users have complete control and can decide whether and when to decrypt their data using keys they control from their own devices.
- This privacy risk may be a bridge too far for some, especially since the monetary rewards are still low.
The intrigue: AI will soon be everywhere, it seems. For many, there's a growing feeling that you either have to get on the AI train or get run over by it.
- The majority of Americans say they want more control over how AI is used in their lives, according to a recent Pew Research poll.
Catch up quick: Anna Kazlauskas, CEO of OpenDataLabs and creator of Vana, told Axios that her interest in creating Vana came from her time training AI models at CSAIL, MIT's AI lab.
- That's where she learned that the only thing that mattered in building better AI was better data.
- In April 2024, Vana launched a method for Reddit users to pool their data and sell it to companies training AI models, or to Reddit itself.
What they're saying: "I think a lot of people are scared of AI replacing them, right?" Kazlauskas says.
- She encourages people to think of their data as equity in AI. "As we create AI, if my data helped train it, then I should actually own that AI model."
- Users could try to broker deals with platforms directly, but those platforms may not be willing or legally able to sell that user data.
The bottom line: In an era when AI companies are racing to train on whatever publicly available data they can find, Vana is pushing a user-first alternative — one where your own personal training data is portable, private and owned by you.
