State of Open Source AI on Hugging Face: Spring 2026 Deep Dive

📌 Key Takeaways

  • Explosive Growth: Hugging Face reached 13M users, 2M+ models, and 500K+ datasets in 2025
  • China’s Dominance: Chinese models now account for 41% of all downloads, surpassing the US
  • Individual Power: Independent developers rose from 17% to 39% of all downloads
  • Robotics Boom: Robotics datasets grew from 1,145 to 26,991, becoming the largest category
  • Small Model Preference: Despite ever-larger frontier releases, the median downloaded model is still just 406M parameters

The Open Source AI Explosion

If you’ve been paying attention to the AI landscape, you’ve probably noticed something remarkable happening: open source AI isn’t just growing—it’s absolutely exploding. The latest report from Hugging Face paints a picture of an ecosystem that has fundamentally transformed over the past year, and the numbers are genuinely staggering.

We’re talking about 13 million users, more than 2 million public models, and over 500,000 public datasets as of 2025. That’s nearly double the activity we saw just a year ago. But here’s what makes this growth even more interesting: it’s not just about consumption anymore. People aren’t just downloading pre-trained models and calling it a day—they’re actively creating, fine-tuning, adapting, and building entirely new applications.

Think about what this means for a moment. We’re witnessing the democratization of AI in real time. The barriers to entry that once kept AI development locked behind corporate walls are crumbling, and individual creators are stepping up to fill the gap with remarkable innovation and creativity.

Of course, this growth comes with its own complexities. The ecosystem remains highly concentrated—about half of all models on Hugging Face have fewer than 200 downloads, while the top 200 models (just 0.01% of all models) account for nearly 50% of total downloads. It’s a classic example of the power law distribution that governs many digital ecosystems.
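
This concentration pattern can be illustrated with synthetic data: drawing per-model download counts from a heavy-tailed Pareto distribution and measuring what share of the total the top sliver captures. A minimal sketch, with illustrative parameters that are not fitted to real Hugging Face data:

```python
import random

random.seed(0)

# Synthetic per-model download counts from a Pareto-like distribution
# (alpha=1.1 is an illustrative choice, not an empirical fit).
N_MODELS = 100_000
downloads = sorted(
    (int(random.paretovariate(1.1)) for _ in range(N_MODELS)),
    reverse=True,
)

total = sum(downloads)
top_k = max(1, N_MODELS // 10_000)           # top 0.01% of models
top_share = sum(downloads[:top_k]) / total   # their share of all downloads

low_traffic = sum(1 for d in downloads if d < 200)

print(f"top {top_k} models hold {top_share:.0%} of all downloads")
print(f"{low_traffic / N_MODELS:.0%} of models have fewer than 200 downloads")
```

Even with crude parameters, the shape is the same as the report's: a tiny head captures a disproportionate share while the long tail sees minimal traffic.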

China’s Strategic Shift to Open Source

Perhaps the most dramatic shift in the landscape has been China’s wholesale embrace of open source AI. This wasn’t a gradual evolution—it was more like flipping a switch. The catalyst? DeepSeek’s viral R1 model release in January 2025, which essentially announced to the world that China was serious about competing in the open source arena.

The numbers tell an incredible story. Chinese models now account for 41% of all downloads on Hugging Face, surpassing the United States for the first time. This isn’t just about one or two standout models—it represents a systematic shift by major Chinese technology companies toward open development.

Consider Baidu’s transformation: they went from exactly zero releases on Hugging Face in 2024 to over 100 releases in 2025. ByteDance and Tencent each increased their releases by eight to nine times. Even companies that had previously favored closed approaches, like MiniMax, pivoted decisively toward open releases.

What’s driving this shift? It’s likely a combination of strategic positioning, regulatory considerations, and recognition that open source can accelerate innovation cycles. When you’re competing with well-funded Western companies, open source becomes a powerful lever for building ecosystem adoption and reducing barriers for developers who might otherwise default to established platforms.

The ripple effects extend far beyond just model releases. Chinese organizations are also investing heavily in the infrastructure to support open source AI, including domestically developed chips optimized for running open models locally.

Corporate Adoption and Big Tech Investment

While individual creators are making waves, corporate adoption of open source AI has reached unprecedented levels. Over 30% of Fortune 500 companies now maintain verified accounts on Hugging Face—a clear signal that open source AI has moved from experimental to essential for large enterprises.

The corporate adoption story isn’t just about using existing models; it’s about active contribution and ecosystem building. NVIDIA has emerged as the strongest contributor among big tech companies, with a dramatic increase in repository growth over time. This makes perfect sense when you consider that NVIDIA’s business model thrives on widespread AI adoption, regardless of whether the models are open or closed.

What’s particularly interesting is how startups are integrating open source into their core offerings. Companies like Thinking Machines built their Tinker fine-tuning API entirely on open-weight models, while popular development environments like VS Code and Cursor now support both open and closed models seamlessly. This hybrid approach is becoming the norm rather than the exception.

The trend extends to established companies as well. Organizations like Airbnb have significantly increased their engagement with the open ecosystem, and Hugging Face has seen substantial growth in enterprise subscription upgrades throughout 2025. This suggests that open source AI has proven its value in production environments, not just research contexts.

From an economic perspective, studies suggest that the downstream value created by open AI artifacts far exceeds the cost of producing them—a dynamic that’s reminiscent of traditional open source software but amplified by AI’s broader applicability across industries.

The Geographic Rebalancing of AI Power

The geographic landscape of AI development is undergoing a fundamental rebalancing, and the implications extend far beyond simple market share statistics. What we’re seeing is a shift from US-dominated AI development toward a more multipolar ecosystem where different regions contribute in distinct but complementary ways.

Looking at the historical data, the US and China have been the clear frontrunners, with the UK, Germany, and France serving as important secondary contributors. However, 2025 marked the year when Chinese models not only caught up but actually surpassed US models in terms of monthly downloads and overall adoption.

Here’s what makes this shift particularly significant: it’s not just about individual companies or models. We’re seeing entire ecosystems develop around different geographic centers of AI innovation. The United States and Western Europe continue to dominate through large industry labs like Google, Meta, and Stability AI, but China has built a parallel ecosystem focused on rapid iteration, broad accessibility, and deep integration with local infrastructure.

France, Germany, and the UK maintain their contributions through research organizations and national AI initiatives, often focusing on specialized model families or domain-specific applications. The diversity of contributors and organizational forms tends to produce more widely adopted artifacts, suggesting that healthy competition across different approaches drives better outcomes for the entire ecosystem.

Perhaps most remarkably, individual users without clear organizational affiliations account for about half of all platform downloads. This suggests that geographic boundaries matter less than they once did, and innovation can emerge from anywhere in the global community.

Individual Creators vs Organizations

One of the most democratizing trends in open source AI has been the rising influence of individual creators and small collectives. The numbers here are particularly striking: independent developers have grown from representing 17% of downloads in early years to an impressive 39% by 2025, at times accounting for more than half of total usage.

This isn’t just about hobbyist projects or research experiments. Individual creators are often the ones doing the crucial work of quantizing models for efficient deployment, creating specialized adaptations for specific use cases, and redistributing base models in formats that typical users can actually run on their hardware.
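
Quantization, one of the most common adaptations individual creators publish, maps floating-point weights to low-precision integers plus a scale factor, shrinking storage and memory needs. A minimal, library-free sketch of symmetric 8-bit weight quantization (real pipelines such as GGUF or GPTQ are considerably more sophisticated):

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Toy example: a random "weight matrix" standing in for a model layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, at the cost of a small
# per-weight reconstruction error bounded by about scale / 2.
print("max abs error:", float(np.abs(w - w_hat).max()))
print("bytes:", w.nbytes, "->", q.nbytes)
```

The 4x size reduction is exactly why quantized redistributions let models run on consumer GPUs and laptops that could never hold the original float32 weights.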

These intermediary creators have become surprisingly influential in steering how innovations spread through the ecosystem. They’re the ones figuring out how to make cutting-edge models accessible to developers who don’t have access to massive GPU clusters, and they’re often the first to identify and address practical deployment challenges that the original model creators might not have considered.

In fact, when looking at trending model development, individual users collectively ranked fourth among creators of new trending models, ahead of many established organizations. That competitive model creation is now accessible at the individual level represents a genuine paradigm shift in how AI innovation happens.

The phenomenon extends beyond just the US and China as well. Countries like France and South Korea have seen particularly strong contributions from individual creators, suggesting that this democratization trend is truly global in scope.

What’s driving this trend? Partly it’s the increasing availability of powerful base models that individuals can fine-tune and adapt. But it’s also the maturation of tools and platforms that make sophisticated AI development accessible to people who might not work for large tech companies but have domain expertise and creative ideas to contribute.

Model Popularity and Community Preferences

Understanding what the community actually likes and uses tells us a lot about where open source AI is heading. The “most liked” models on Hugging Face serve as a fascinating barometer of community attention and preferences, even if they don’t always perfectly correlate with actual usage patterns.

The transformation over the past year has been dramatic. Twelve months ago, the most liked models were predominantly from the US, led by Meta’s Llama family. By 2026, the landscape had shifted to an international mix, with China’s DeepSeek-R1 sitting at the very top of the popularity rankings.

This shift reflects more than just changing preferences—it represents the community’s recognition of technical innovation and practical value. DeepSeek’s success wasn’t just about good marketing; it delivered genuine capabilities that developers found useful in their work.

When we look at scientific contributions, the pattern becomes even more interesting. The most upvoted papers on the platform come predominantly from large organizations, with a notable concentration of Chinese Big Tech companies. ByteDance, in particular, has been sharing a remarkably high volume of high-impact research papers.

However, when we examine Hugging Face’s Daily Papers—a curated collection that tends to highlight work with strong open source adoption—the picture becomes more diverse. Medical applications emerge as particularly influential, while Big Tech’s direct influence appears more sparse in favor of specialized research groups and domain experts.

This suggests that community preferences operate on multiple levels: people appreciate and recognize breakthrough research from major labs, but they’re most likely to actually adopt and build upon work that comes from more specialized, domain-focused efforts.

The Rise of Smaller, Deployable Models

Here’s something that might surprise you: despite all the headlines about ever-larger models with billions or trillions of parameters, the practical reality is that smaller models are dominating actual adoption and deployment. This trend reveals a lot about what developers really need versus what captures media attention.

The data shows an interesting divergence between mean and median model sizes. The mean size of downloaded models rose dramatically from 827M parameters in 2023 to 20.8B parameters in 2025. But the median? It increased only marginally, from 326M to 406M parameters. This tells us that while some users are pulling up the average with very large models, the typical user is still working with relatively compact, deployable systems.
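
This mean/median divergence is the textbook signature of a right-skewed distribution: a handful of very large downloads pull the mean far above the typical case. A quick illustration with made-up model sizes, in millions of parameters (the numbers below are hypothetical, not the report's data):

```python
from statistics import mean, median

# Hypothetical download log: mostly sub-1B models, plus a few frontier
# giants at the end (sizes in millions of parameters; illustrative only).
sizes_m = [125, 350, 350, 406, 560, 780, 1_100, 3_000, 7_000, 405_000, 671_000]

print(f"mean:   {mean(sizes_m):,.0f}M parameters")   # pulled up by the giants
print(f"median: {median(sizes_m):,.0f}M parameters") # the typical download
```

Here just two frontier-scale entries drive the mean above 99,000M while the median stays at 780M, mirroring how a minority of very large downloads can coexist with a typical user who works at sub-1B scale.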

This preference for smaller models isn’t just about convenience—it reflects real-world constraints around cost, latency, and hardware availability. Most developers and organizations need models that can actually run on accessible hardware without breaking the budget or introducing unacceptable delays into their workflows.

The implications extend beyond just practical deployment. Smaller models are downloaded and deployed at far higher rates than massive systems, partly because more small models are released, but also because they’re simply more usable for most applications. Even when accounting for the higher volume of small model releases, analysis from the ATOM Project shows that median top-10 models in the 1-9B parameter range are downloaded only about 4x more than models above 100B parameters—suggesting that size alone isn’t the dominant factor in adoption.

What’s particularly interesting is how quickly performance gaps between frontier models and smaller systems can narrow through fine-tuning and task-specific adaptation. On Hugging Face, models with hundreds of millions of parameters successfully support search, tagging, and document processing workflows, while models in the single-digit billions handle coding, reasoning, and multimodal tasks effectively.

This trend toward capable small models is shifting autonomy closer to the edge, reducing dependency on centralized cloud providers and making AI more accessible to developers and organizations with limited resources. It’s a genuine democratization of AI capabilities.

Specialized Communities: Robotics Revolution

If you want to see the future of open source AI, look at what’s happening in robotics. The growth here isn’t just impressive—it’s absolutely unprecedented. Robotics datasets on Hugging Face grew from 1,145 in 2024 to an astounding 26,991 in 2025, catapulting from rank 44 to become the single largest dataset category on the platform.

To put this in perspective, text generation—the second-largest category—had only around 5,000 datasets in 2025. We’re talking about a community that has grown by more than 20x in a single year, representing a fundamental shift in how people think about AI applications.

The diversity of contributions is remarkable. Community-contributed datasets span everything from household manipulation tasks to autonomous driving scenarios. Large-scale projects like Learning to Drive (L2D), developed through a LeRobot collaboration, provide the kind of multimodal, real-world data that’s essential for training genuinely useful robotic systems.

Projects like RoboMIND, with over 107,000 real-world trajectories across 479 distinct tasks and multiple robot embodiments, demonstrate the scale and ambition of this community effort. This isn’t just academic research—it’s the practical foundation for deployable robotic systems.

Hugging Face’s acquisition of Pollen Robotics has opened up open source robotic hardware to industry, academic labs, and everyday hobbyists, creating a complete ecosystem from software to hardware. Meanwhile, LeRobot, Hugging Face’s comprehensive robotics library, has seen its GitHub stars nearly triple over the past year.

What makes the robotics community particularly interesting is how it’s bridging the gap between digital AI and physical world applications. Embodied AI represents a fundamentally different challenge than text or image generation, requiring integration across perception, planning, and control systems.

AI for Science and Research Collaboration

Scientific research represents another area where open source AI is demonstrating remarkable potential for large-scale collaboration and innovation. Unlike traditional scientific publishing, which can be slow and siloed, the open source AI ecosystem enables rapid sharing, iteration, and building upon each other’s work.

The applications span an impressive range: protein folding, molecular dynamics, drug discovery, and scientific data analysis are all seeing significant activity. What’s particularly exciting is that all the major frontier AI companies now have dedicated science teams, though much of the current focus remains on literature discovery rather than direct experimentation.

The collaborative aspect is where open source really shines in scientific contexts. Community-led projects regularly involve hundreds of contributors working across different institutions and disciplines—something that would be extremely difficult to organize through traditional academic or corporate structures alone.

These efforts highlight how open source serves as more than just a development model; it’s become a mechanism for coordinating large-scale, interdisciplinary research that can tackle problems too complex for any single organization to address effectively.

The Hugging Face science community has created tools like release heatmaps that help researchers track developments across different scientific domains, fostering awareness and collaboration that might not otherwise happen through traditional academic channels.

What’s emerging is essentially a new model for scientific collaboration—one that’s more immediate, more inclusive, and more capable of handling the interdisciplinary challenges that characterize cutting-edge scientific research in the AI era.

Looking Ahead: The Future of Open AI

As we look toward the rest of 2026 and beyond, several key trends seem likely to define the next phase of open source AI evolution. The geographic rebalancing we’ve discussed is likely to accelerate, with Western organizations increasingly seeking commercially deployable alternatives to Chinese models.

This competitive pressure is driving significant investment in projects like OpenAI’s GPT-OSS, AI2’s OLMo, and Google’s Gemma—all attempting to offer competitive open alternatives from US and European developers. Whether these efforts can match the adoption momentum of Qwen and DeepSeek will be one of the defining questions of 2026.

The growth of specialized communities in robotics and scientific research suggests that open source AI is expanding well beyond language and image generation into physical and experimental domains. The infrastructure, norms, and coordination mechanisms developed around text and image models are being adapted for entirely new modalities and use cases.

AI sovereignty initiatives are becoming increasingly important, with governments recognizing that control over AI infrastructure and capabilities has strategic implications. South Korea’s National Sovereign AI Initiative, Switzerland’s Swiss AI project, and various EU-funded efforts all reflect this trend toward national or regional AI capabilities.

The trend toward smaller, more deployable models is likely to continue, driven by practical constraints and the recognition that most real-world applications don’t actually require the largest possible models. Edge deployment and reduced dependency on centralized cloud providers will probably become even more important as the ecosystem matures.

Perhaps most importantly, the rise of individual creators and small teams suggests that innovation in AI is becoming increasingly democratized. The traditional model where breakthrough AI capabilities could only come from well-funded corporate research labs is clearly breaking down.

For anyone working with AI—whether you’re a researcher, developer, or business leader—these trends suggest that staying connected to the open source ecosystem isn’t just beneficial, it’s essential. The most significant innovations and practical advances are increasingly happening in open, collaborative environments rather than behind closed doors.

The open source AI ecosystem continues to serve as the foundational layer for building, evaluating, and governing AI systems across industries and applications. With the rise of AI agents and increasing focus on interoperability, open source will likely become even more critical as the infrastructure that enables different AI systems to work together effectively.

What’s clear from the Spring 2026 state of open source AI is that we’re not just witnessing incremental growth—we’re seeing the emergence of an entirely new paradigm for how AI development, deployment, and collaboration happen. And this transformation is just getting started.

Frequently Asked Questions

How has China’s role in open source AI changed?

China dramatically shifted toward open source following DeepSeek’s viral R1 release in January 2025. Chinese models now account for 41% of all downloads on Hugging Face, with organizations like Baidu increasing from zero releases to over 100 in 2025, and ByteDance and Tencent each increasing releases by eight to nine times.

What are the fastest-growing AI communities on Hugging Face?

Robotics has emerged as the fastest-growing community, with datasets growing from 1,145 in 2024 to 26,991 in 2025, making it the single largest dataset category. AI for science is another rapidly growing area, with community-led projects involving hundreds of contributors working on protein folding, drug discovery, and scientific data analysis.

Are smaller AI models more popular than large ones?

Yes, smaller models dominate adoption due to practical constraints around cost, latency, and hardware availability. While the mean model size increased to 20.8B parameters in 2025, the median only rose to 406M parameters, indicating that most users still prefer deployable, efficient models over massive frontier systems.

What role does AI sovereignty play in open source development?

AI sovereignty has become increasingly important, with countries launching national initiatives to develop domestic AI capabilities. South Korea’s National Sovereign AI Initiative, Switzerland’s Swiss AI project, and various EU-funded efforts reflect governments’ desire to reduce reliance on foreign-controlled systems and maintain local control over AI infrastructure.

How concentrated is the open source AI ecosystem?

Despite rapid growth to 13 million users and 2 million models, the ecosystem remains highly concentrated. Half of all models have fewer than 200 downloads, while the top 200 models (0.01% of all models) comprise 49.6% of total downloads. However, specialized communities often show sustained engagement despite modest overall download counts.
