Open Source

MacFORTH Code for 1984 Robot-Coding Game 'ChipWits' from 1984 is Now Open Source (chipwits.com) 10

Back in the mid-1980s Mark Roth was in 5th grade when the game ChipWits "helped kindle his interest in coding," according to an online biography. ("By middle school, he wrote his first Commodore 64 assembler and by high school he authored a 3D Graphics library for DOS.")

And 40 years later, Slashdot reader markroth8 writes that the programming puzzle/logic game "inspired many people to become professional coders": ChipWits was first released for Mac in 1984, and was later ported to Commodore 64 and Apple II in 1985. To celebrate the game's 40th anniversary, the team behind the new Steam reboot of ChipWits (including its original co-creator Doug Sharp, also of fame for the game King of Chicago) is announcing the recovery and open source release of the original game's source code, written in the FORTH programming language, for both Mac and Commodore 64 platforms.

Recovering data from 40-year old 5.25" and 3.5" disks was a challenge in and of itself, and most of the data survived unscathed! It's interesting to read the 40-year-old code, and compare it to modern game development.

"Our goal for open sourcing the original version of ChipWits is to ensure its legacy lives on," according to the announcement. (It adds that "We also wanted to share an appreciation for what cross-platform software development for 8-bit microcomputers was like in 1984.")
Open Source

GitHub Announces New Open Source Fund with Security Mentoring (techcrunch.com) 2

The GitHub Secure Open Source Fund launched this week with an initial commitment of $1.25 million, reports TechCrunch, using "capital from contributors including American Express, 1Password, Shopify, Stripe, and GitHub's own parent company Microsoft." GitHub briefly teased the new initiative at its annual GitHub Universe developer conference last month, but Tuesday it announced full details and formally opened the program for applicants, which will be reviewed "on a rolling basis" through the closing date of January 7, 2025, with programming and funding starting shortly after...

Tuesday's news builds on a number of previous GitHub initiatives designed to support project maintainers that work on key components of critical software, including GitHub Sponsors which landed in 2019 (and which is powering the new fund), but more directly the GitHub Accelerator program that launched its first cohort last year — the GitHub Secure Open Source Fund is essentially an extension of that.

"We're trying to acknowledge the fact that we're the home of open source, ultimately, and we have an obligation to help ensure that open source can continue to thrive and have the support that it needs," GitHub Chief Operating Officer Kyle Daigle told TechCrunch in an interview. Qualifying projects can be pretty much any project that has an open source license, but of course GitHub will be looking at those that need the funds most — so Kubernetes can hold fire with its application. "We're looking for the outsized impact, which tends to be big projects with few maintainers that we all rely on," Daigle said.

The sum of $1.25 million might sound like a reasonable amount, but it will be split across 125 projects, which means just $10,000 each — better than nothing, for sure, but a drop in the ocean on the grand scheme of things. However, Daigle is quick to stress that money is only part of the prize here — as with the initial accelerator program, maintainers embark on a three-week program, which includes mentorship, certification, education workshops, and ongoing access to GitHub tools.

From GitHub's announcement: Since introducing support for organizations through GitHub Sponsors, more than 5,800 organizations, including Microsoft and Stripe, have invested in maintainers and projects on GitHub, up nearly 40% YoY. Cumulatively, the platform has unlocked over $60 million in funding for maintainers to help them spend more time working on their projects.

But we know we're just scratching the surface when it comes to organizations and corporate support of open source. This summer, we partnered with the Linux Foundation and researchers from Laboratory for Innovation Science at Harvard (LISH) to learn more about the state of open source funding today. Diving in, we assessed organizations funding behaviors, potential misalignments, and opportunities to improve. In the report launched today, we found:


- Responding organizations annually invest $1.7 billion in open source, which can be extrapolated to estimate that approximately $7.7 billion is invested across the entire open source ecosystem annually.

- 86% of investment is in the form of contribution labor by employees and contractors working for the funding organization, with the remaining 14% being direct financial contributions.

- Organizations generally know how and where they contribute (65%) but lack specific clarity of their contributions (38%).

- Security efforts focus on bugs and maintenance; only a few (6%) said comprehensive security audits are a priority.


We all stand to benefit from unlocking more funding for open source. By tackling problems like open source security as an ecosystem, we believe we can help create more available funding and resources that are vital to the sustainability of open source. Not every open source project or maintainer has access to funding and training for security. That's why we created a fund that everyone potentially eligible can apply for...

This is the beginning of a journey into helping find ways to secure open source. On its own, it's not the answer, but we are confident it will help. We will be monitoring the impact of these investments and share what we learn as we go.

Programming

Verify the Rust's Standard Library's 7,500 Unsafe Functions - and Win 'Financial Rewards' (devclass.com) 85

The Rust community has "recognized the unsafety of Rust (if used incorrectly)," according to a blog post by Amazon Web Services.

So now AWS and the Rust Foundation are "crowdsourcing an effort to verify the Rust standard library," according to an article at DevClass.com, "by setting out a series of challenges for devs and offering financial rewards for solutions..." Rust includes ways to bypass its safety guarantees though, with the use of the "unsafe" keyword... The issue AWS highlights is that even if developers use only safe code, most applications still depend on the Rust standard library. AWS states that there are approximately 7.5K unsafe functions in the Rust Standard Library and notes that 57 "soundness issues" and 20 CVEs (Common Vulnerabilities and Exposures) have been reported in the last three years. [28% of the soundness issues were discovered in 2024.]

Marking a function as unsafe does not mean it is vulnerable, only that Rust does not guarantee its safety. AWS plans to reduce the risk by using tools and techniques for formal verification of key library code, but believes that "a single team would be unable to make significant inroads" for reasons including the lack of a verification mechanism in the Rust ecosystem and what it calls the "unknowns of scalable verification." The plan therefore is to turn this over to the community, by posing challenges and rewarding developers for solutions.... A GitHub repository provides a fork of the Rust code and includes a set of challenges, currently 13 of them... The Rust Foundation says that there is a financial reward tied to each challenge, and that the "challenge rewards committee is responsible for reviewing activity and dispensing rewards." How much will be paid though is not stated.

Despite the wide admiration for Rust, there is no formal specification for the language, an issue which impacts formal verification efforts.

Thanks to Slashdot reader sean-it-all for sharing the news.
Open Source

Jim Zemlin, 'Head Janitor of Open Source,' Marks 20 Years At Linux Foundation (zdnet.com) 3

ZDNet's Steven Vaughan-Nichols interviews Jim Zemlin, Executive Director of The Linux Foundation and "head janitor of open source." An anonymous Slashdot reader shares an excerpt from the article: When I first met Zemlin, he was the head of the Free Standards Group (FSG). The FSG's main project was the Linux Standard Base (LSB) project. The LSB's goal was to get everyone in the Linux desktop world to agree on standards to ensure compatibility among distributions and their applications. Oh well, some struggles are never-ending. Another group, the Open Source Development Labs (OSDL), was simultaneously working on standardizing enterprise Linux. The two non-profits had the same goal of making Linux more useful and popular, so they agreed to merge. Zemlin was the natural pick to head this new group, which would be called The Linux Foundation.

At the time, he told me: "The combination of the two groups really enables the Linux platform and all the members of the Linux Foundation to work really effectively. I clearly understand what the organization's charter needs to be: We need to provide services that are useful to the community and industry, as well as protect, promote, and continue to standardize the platform." While initially focused on Linux, the Foundation's scope expanded significantly around 2010. Until then, the organization had hosted about a dozen projects related to the Linux operating system. However, as Linux gained dominance in various sectors, including high-performance computing, automotive, embedded systems, mobile devices, and cloud computing, the Linux Foundation started to broaden its horizons.
Zemlin says there are three words that sum up the Linux Foundation's effort to keep open source safe and open to a new generation of developers: helpful, hopeful, and humble.

"You must be genuinely helpful to developers. We're the janitors of open source. The Linux Foundation takes care of all the boring but important stuff necessary to support software development so developers can focus on code. This work includes events, project marketing, project infrastructure, finances for projects, training and education, legal assistance, standards, facilitation, open source evangelism, and much, much more."

He continued: "The hopeful part is really the optimistic part. When in 2007, people were saying that this would never work. When leaders of huge companies tell everyone that you know all that you're doing is a cancer or terrible, you have to have a sense of optimism that there are better days ahead. You have to always be thinking, 'No, we can do it and stick with it.'"

However, Zemlin concluded that the number one trait that's "important in working in open source is this idea of humility. I work with hundreds of people every day, and none of them work at the Linux Foundation. We must lead through influence, and that really has been the secret for 20 years of working here without going totally insane. If you can check your ego and take criticism, open source actually turns out to be a really fun community to work with."
Open Source

Twenty Is Building an Open Source Alternative To Salesforce (techcrunch.com) 22

An anonymous reader quotes a report from TechCrunch: For the past couple of years, the startup has been iterating on a brand-new CRM platform and making everything available on GitHub under a permissive AGPLv3 license. While Twenty doesn't have all the features that you can find in Salesforce [comparison], the company is slowly building a community of CRM and open source enthusiasts around it, with more than 300 contributors in the last year and 20,000 stars on GitHub. [...] Twenty is trying to build a flexible platform that can be tweaked to every company's needs and that can serve as a basis for other tools and use cases. Each entry in a CRM is an object. It can be a standard, pre-defined object like a person or a company. But customers can also create their own custom objects.

If you're a conference organizer, you can create a conference object. If you're a restaurant chain manager, you can create a restaurant object. As you may have guessed, Twenty also lets you create custom fields for each object. This way, it's easier to capture and compare data across multiple entries. This customer data can be viewed in Twenty directly in list or Kanban views. People can sort and filter entries, add tasks and notes, all the usual CRM stuff. But data in Twenty can also be reused with GraphQL and REST APIs. And that's how you can extend Twenty beyond its CRM roots. Eventually, Twenty hopes there will be an active ecosystem of developers working on extensions and plugins to build a proper alternative to the Salesforce product suite. But we're not there yet. "Building a CRM is a daunting task, especially for us because of the way we've chosen to do it. We're building a platform, and we're not taking any shortcut. In fact, we still need to work on workflows, on automation and more," [said Twenty co-founder and CEO Felix Malfait].
"People often don't understand why Salesforce is so big, so powerful," Malfait said. Salesforce's platform utilizes a flexible data model -- a programming language called Apex to execute code on Salesforce's servers and a front-end customization framework.

"So when you have these three bricks you can store data, do logic on the back end, and display the result as you like," Malfait said. "It means that you can do everything. And that's what we want to enable in the long term."
Transportation

'Automotive Grade Linux' Will Promote Open Source Program Offices for Automakers (prnewswire.com) 28

Automotive Grade Linux is a collaborative open source project developing "an open platform from the ground up that can serve as the de facto industry standard" for fast development of new features. Automakers have joined with tech companies and suppliers to speed up development (and adoption) of "a fully open software stack for the connected car" — hosted at the Linux Foundation, and "with Linux at its core..."

And this week they created a new Open Source Program Office expert group, led by Toyota, to promote the establishment of Open Source Program Offices within the automotive industry, "and encourage the sharing of information and best practices between them." Open source software has become more prevalent across the automotive industry as automakers invest more time and resources into software development. Automakers like Toyota and Subaru are using open source software for infotainment and instrument cluster applications. Other open source applications across the automotive industry include R&D, testing, vehicle-to-cloud and fleet management. "Historically, there has been little code contributed back to the open source community," said Dan Cauchy, Executive Director of Automotive Grade Linux. "Often, this was because the internal procedures or IT infrastructure weren't in place to support open source contributions. The rise of software-defined vehicles has led to a growing trend of automakers not just using, but also contributing, to open source software. Many organizations are also establishing Open Source Program Offices to streamline and organize open source activities to better support business goals."

Automakers including Toyota, Honda, and Volvo have already established Open Source Program Offices. The new AGL OSPO Expert Group provides a neutral space for them to share pain points and collaborate on solutions, exchange information, and develop best practices that can help other automakers build their own OSPOs. "Toyota has been participating in AGL and the broader open source community for over a decade," said Masato Endo, Group Manager of Open Source Program Group, Toyota. "We established an OSPO earlier this year to promote the use of open source software internally and to help guide how and where we contribute. We are looking forward to working with other open source leaders to solve common problems, collaborate on best practices, and invigorate open source activities in the automotive industry."

The AGL OSPO EG is led by Toyota with support from Panasonic and AISIN Corporation.

AI

AI Lab PleIAs Releases Fully Open Dataset, as AMD, Ai2 Release Open AI Models (huggingface.co) 5

French private AI lab PleIAs "is committed to training LLMs in the open," they write in a blog post at Mozilla.org. "This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define 'open' strictly: all data must be both accessible and under permissive licenses."

Wednesday PleIAs announced they were releasing the largest open multilingual pretraining dataset, according to their blog post at HuggingFace: Many have claimed that training large language models requires copyrighted data, making truly open AI development impossible. Today, Pleias is proving otherwise with the release of Common Corpus (part of the AI Alliance Open Trusted Data Initiative) — the largest fully open multilingual dataset for training LLMs, containing over 2 trillion tokens of permissibly licensed content with provenance information (2,003,039,184,047 tokens).

As developers are responding to pressures from new regulations like the EU AI Act, Common Corpus goes beyond compliance by making our entire permissibly licensed dataset freely available on HuggingFace, with detailed documentation of every data source. We have taken extensive steps to ensure that the dataset is high-quality and is curated to train powerful models. Through this release, we are demonstrating that there doesn't have to be such a [heavy] trade-off between openness and performance.

Common Corpus is:

— Truly Open: contains only data that is permissively licensed and provenance is documented

— Multilingual: mostly representing English and French data, but contains at least 1B tokens for over 30 languages

— Diverse: consisting of scientific articles, government and legal documents, code, and cultural heritage data, including books and newspapers

— Extensively Curated: spelling and formatting has been corrected from digitized texts, harmful and toxic content has been removed, and content with low educational content has also been removed.


Common corpus builds on a growing ecosystem of large, open datasets, such as Dolma, FineWeb, RefinedWeb. The Common Pile currently in preparation under the coordination of Eleuther is built around the same principle of using permissible content in English language and, unsurprisingly, there were many opportunities for collaborations and shared efforts. But even together, these datasets do not provide enough training data for models much larger than a few billion parameters. So in order to expand the options for open model training, we still need more open data...

Based on an analysis of 1 million user interactions with ChatGPT, the plurality of user requests are for creative compositions... The kind of content we actually need — like creative writing — is usually tied up in copyright restrictions. Common Corpus tackles these challenges through five carefully curated collections...

Last week AMD also released its first series of fully open 1 billion parameter language models, AMD OLMo.

And last month VentureBeat reported that the non-profit Allen Institute for AI had unveiled Molmo, "an open-source family of state-of-the-art multimodal AI models which outpeform top proprietary rivals including OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, and Google's Gemini 1.5 on several third-party benchmarks."
Patents

Open Source Fights Back: 'We Won't Get Patent-Trolled Again' (zdnet.com) 64

ZDNet's Steven Vaughan-Nichols reports: [...] At KubeCon North America 2024 this week, CNCF executive director Priyanka Sharma said in her keynote, "Patent trolls are not contributors or even adopters in our ecosystem. Instead, they prey on cloud-native adopters by abusing the legal system. We are here to tell the world that these patent trolls don't stand a chance because CNCF is uniting the ecosystem to deter them. Like a herd of musk oxen, we will run them off our pasture." CNCF CTO Chris Aniszczyk added: "The reason trolls can make money is that many companies find it too expensive to fight back, so they pay trolls a settlement fee to avoid the even higher cost of litigation. Now, when a whole herd of companies band together like musk oxen to drive a troll off, it changes the cost structure of fighting back. It disrupts their economic model."

How? Jim Zemlin, the Linux Foundation's executive director, said, "We don't negotiate with trolls. Instead, with United Patents, we go to the PTO and crush those patents. We strive to invalidate them by working with developers who have prior art, bringing this to the attention of the USPTO, and killing patents. No negotiation, no settlement. We destroy the very asset that made patent trolls' business work. Together, since we've started this effort, 90% of the time, we've been able to go in there and destroy these patents." "It's time for us to band together," said Joanna Lee, CNCF's VP of strategic programs and legal. "We encourage all organizations in our ecosystem to get involved. Join the fight, enhance your own company's protection, protect your customers, enhance our community defense, and save money on legal expenses."

While getting your company and its legal department involved in the effort to fend off patent trolls is important, developers can also help. CNCF announced the Cloud Native Heroes Challenge, a patent troll bounty program in which cloud-native developers and technologists can earn swag and win prizes. They're asking you to find evidence of preexisting technology -- referred to by patent lawyers as "prior art" -- that can kill off bad patents. This could be open-source documentation (including release notes), published standards or specifications, product manuals, articles, blogs, books, or any publicly available information. All entrants who submit an entry that conforms to the contest rules will receive a free "Cloud Native Hero" t-shirt that can be picked up at any future KubeCon+CloudNativeCon. The winner will also receive a $3,000 cash prize.

In the inaugural contest, the CNCF is seeking information that can be used to invalidate Claim 1 from US Patent US-11695823-B1. This is the major patent asserted by Edge Networking Systems against Kubernetes users. As is often the case with such patents, it's much too broad. This patent describes a network architecture that facilitates secure and flexible programmability between a user device and across a network with full lifecycle management of services and infrastructure applications. That describes pretty much any modern cloud system. If you can find prior art that describes such a system before June 13, 2013, you could be a winner. Some such materials have already been found. This is already listed in the "known references" tab of the contest information page and doesn't qualify. If you care about keeping open-source software easy and cheap to use -- or you believe trolls shouldn't be allowed to take advantage of companies that make or use programs -- you can help. I'll be doing some digging myself.

Music

Spotify's Car Thing, Due For Bricking, Is Getting an Open Source Second Life (arstechnica.com) 15

If you have Spotify's soon-to-be-bricked Car Thing, there are a few ways you can give it a new lease on life. YouTuber Dammit Jeff has showcased modifications to Car Thing that makes the device useful as a desktop music controller, customizable shortcut tool, or a simple digital clock. Ars Technica's Kevin Purdy reports: Spotify had previously posted the code for its uboot and kernel to GitHub, under the very unassuming name "spsgsb" and with no announcement (as discovered by Josh Hendrickson). Jeff has one idea why the streaming giant might not have made much noise about it: "The truth is, this thing isn't really great at running anything." It has half a gigabyte of memory, 4GB of internal storage, and a "really crappy processor" (Amlogic S905D2 SoC) and is mostly good for controlling music.

How do you get in? The SoC has a built-in USB "burning mode," allowing for a connected computer, running the right toolkit, to open up root access and overwrite its firmware. Jeff has quite a few issues getting connected (check his video description for some guidance), but it's "drag and drop" once you're in. Jeff runs through a few of the most popular options for a repurposed Car Thing:

- DeskThing, which largely makes Spotify desk-friendly, but adds a tiny app store for weather (including Jeff's own WeatherWave), clocks, and alternate music controls
- GlanceThing, which keeps the music controls but also provides some Stream-Deck-like app-launching shortcuts for your main computer.
- Nocturne, currently invite-only, is a wholly redesigned Spotify interface that restores all its Spotify functionality.

The Gimp

GIMP 3.0 Enters RC Testing After 20 Years (tomshardware.com) 55

GIMP 3.0, the long-awaited upgrade from the popular open-source image editor, has entered the release candidate phase, signaling that a stable version may be available by the end of this year or early 2025. Tom's Hardware reports: So, what has changed with the debut of GIMP 3? The new interface is still quite recognizable to classic GIMP users but has been considerably smoothed out and is far more scalable to high-resolution displays than it used to be. Several familiar icons have been carefully converted to SVGs or Scalable Vector Graphics, enabling supremely high-quality, scalable assets.

While PNGs, or Portable Network Graphics, are also known to be high-quality due to their lack of compression, they are still suboptimal compared to SVGs when SVGs are applicable. The work of converting GIMP's tool icons to SVG is still in progress per the original blog post, but it's good that developer Denis Rangelov has already started on the work.

Many aspects of the GIMP 3.0 update are almost wholly on the backend for ensuring project and plugin compatibility with past projects made with previous versions of GIMP. To summarize: a public GIMP API is being stabilized to make it easier to port GIMP 2.10-based plugins and scripts to GIMP 3.0. Several bugs related to color accuracy have been fixed to improve color management while still maintaining compatibility with past GIMP projects.
You can read the GIMP team's blog post here.
Firefox

Firefox Gets More Investment in New Features, Prioritizing People (and Privacy) Over Profit (techcrunch.com) 83

On its 20th anniversary, Firefox "is still going strong, and it is a better browser today than it ever was," according to TechCrunch.

In an interview, Mozilla's interim CEO says one of the first things they did when was to "unlock a bunch of money towards Firefox product development... I've been in enough places where people tend to forget about the core business, and they stop investing in it, because they get distracted by shiny things — and then they regret it." "Firefox is incredibly important, and it is our core. We've actually put more investment into it this year and into connecting with our communities, into bringing out and testing features that are positive and creating good experiences for folks. That's been a huge priority for me and for the company this year, and it's showing up in the results."

She acknowledged that Mozilla doesn't have the device distribution that benefits many of Firefox's competitors, especially on mobile, but she did note that the Digital Marks Act (DMA) in Europe — which means Apple, for example, has to provide a browser choice screen on iOS — is working. "With the DMA, even though the implementation hasn't been outstanding, we're seeing a real shift. When people have the choice to choose Firefox, they're choosing Firefox," she said...

To kick-start some of this growth, Mozilla is looking at reaching new, and younger, users. Chambers noted that Mozilla is running a number of marketing campaigns to make people aware of Firefox, especially those who are only now starting to make their first browser choices. With them, she believes, Mozilla's messaging around privacy lands especially well.

In a future where browsers include AI agents that take actions on behalf of users, there might be more confidence in a browser designed for privacy and transparency, the interim CEO points out — as part of their larger mission. "What I love about Firefox is that it really provides users with an alternative choice of a browser that is just genuinely designed for them.

"We have, from its very inception and throughout, really wanted to create a browser that prioritizes people over profit, prioritizes privacy over anything else, and to have that option, the choice."
Programming

The Team Behind GitHub's 'Atom' IDE Build a Cross-Platform, AI-Optional 'Zed Editor' (itsfoss.com) 29

Nathan Sobo "joined GitHub in late 2011 to build the Atom text editor," according to an online biography, "and he led the Atom team until 2018." Max Brunsfeld joined the Atom team in 2013, and "While driving Atom towards its 1.0 launch during the day, Max spent nights and weekends building Tree-sitter, a blazing-fast and expressive incremental parsing framework that currently powers all code analysis at GitHub."

Last year they teamed up with Antonio Scandurra (another Atom alumnus) to launch a new startup called Zed (which in 2023 raised $10 million, according to TechCrunch). And today the open source blog It's FOSS checks in on their open-source code editor — "Zed Editor". Mainly written in Rust, it supports running in CLI, diagnosing project-wide errors, split panes, and markdown previews: By default, any added content is treated as plain text. I used the language switcher to change it to Rust so that I would get proper syntax highlighting, indentation, error detection, and other useful language-specific functions. The switch highlighted all the Rust elements correctly, and I then focused on Zed Editor's user interface. The overall feel of the editor was minimal, with all the important options being laid out nicely.

[Its status bar] had some interesting panels. The first one I checked was the Terminal Panel, which, as the name suggests, lets you run commands, scripts, and facilitates interaction with system files or processes directly from within the editor. I then moved to the Assistant Panel, which is home to various large language models that can be integrated into Zed Editor. There are options like Anthropic, GitHub Copilot Chat, Ollama, OpenAI, and Google AI... The Zed Editor team has also recently introduced Zed AI in collaboration with Anthropic for assisting with coding, allowing for code generation, advanced context-powered interactions, and more...

The real-time collaboration features on Zed Editor are quite appealing too. To check them out, I had to log in with my GitHub account. After logging in, the Collab Panel opened up, and I could see many channels from the official Zed community. I could chat with others, add collaborators to existing projects, join a call with the option to share my screen and track other collaborators' cursors, add new contacts, and carry out many other collaborative tasks.

One can also use extensions and themes to extend what Zed Editor can do. There are some nice pre-installed themes as well.

Programming

Python Overtakes JavaScript on GitHub, Annual Survey Finds (github.blog) 97

GitHub released its annual "State of the Octoverse" report this week. And while "Systems programming languages, like Rust, are also on the rise... Python, JavaScript, TypeScript, and Java remain the most widely used languages on GitHub."

In fact, "In 2024, Python overtook JavaScript as the most popular language on GitHub." They also report usage of Jupyter Notebooks "skyrocketed" with a 92% jump in usage, which along with Python's rise seems to underscore "the surge in data science and machine learning on GitHub..." We're also seeing increased interest in AI agents and smaller models that require less computational power, reflecting a shift across the industry as more people focus on new use cases for AI... While the United States leads in contributions to generative AI projects on GitHub, we see more absolute activity outside the United States. In 2024, there was a 59% surge in the number of contributions to generative AI projects on GitHub and a 98% increase in the number of projects overall — and many of those contributions came from places like India, Germany, Japan, and Singapore...

Notable growth is occurring in India, which is expected to have the world's largest developer population on GitHub by 2028, as well as across Africa and Latin America... [W]e have seen greater growth outside the United States every year since 2013 — and that trend has sped up over the past few years.

Last year they'd projected India would have the most developers on GitHub #1 by 2027, but now believe it will happen a year later. This year's top 10?

1. United States
2. India
3. China
4. Brazil
5. United Kingdom
6. Russia
7. Germany
8. Indonesia
9. Japan
10. Canada

Interestingly, the UK's population ranks #21 among countries of the world, while Germany ranks #19, and Canada ranks #36.)

GitHub's announcement argues the rise of non-English, high-population regions "is notable given that it is happening at the same time as the proliferation of generative AI tools, which are increasingly enabling developers to engage with code in their natural language." And they offer one more data point: GitHub's For Good First Issue is a curated list of Digital Public Goods that need contributors, connecting those projects with people who want to address a societal challenge and promote sustainable development...

Significantly, 34% of contributors to the top 10 For Good Issue projects... made their first contribution after signing up for GitHub Copilot.

There's now 518 million projects on GitHub — with a year-over-year growth of 25%...
Open Source

New 'Open Source AI Definition' Criticized for Not Opening Training Data (slashdot.org) 38

Long-time Slashdot reader samj — also a long-time Debian developertells us there's some opposition to the newly-released Open Source AI definition. He calls it a "fork" that undermines the original Open Source definition (which was originally derived from Debian's Free Software Guidelines, written primarily by Bruce Perens), and points us to a new domain with a petition declaring that instead Open Source shall be defined "solely by the Open Source Definition version 1.9. Any amendments or new definitions shall only be recognized with clear community consensus via an open and transparent process."

This move follows some discussion on the Debian mailing list: Allowing "Open Source AI" to hide their training data is nothing but setting up a "data barrier" protecting the monopoly, disabling anybody other than the first party to reproduce or replicate an AI. Once passed, OSI is making a historical mistake towards the FOSS ecosystem.
They're not the only ones worried about data. This week TechCrunch noted an August study which "found that many 'open source' models are basically open source in name only. The data required to train the models is kept secret, the compute power needed to run them is beyond the reach of many developers, and the techniques to fine-tune them are intimidatingly complex. Instead of democratizing AI, these 'open source' projects tend to entrench and expand centralized power, the study's authors concluded."

samj shares the concern about training data, arguing that training data is the source code and that this new definition has real-world consequences. (On a personal note, he says it "poses an existential threat to our pAI-OS project at the non-profit Kwaai Open Source Lab I volunteer at, so we've been very active in pushing back past few weeks.")

And he also came up with a detailed response by asking ChatGPT. What would be the implications of a Debian disavowing the OSI's Open Source AI definition? ChatGPT composed a 7-point, 14-paragraph response, concluding that this level of opposition would "create challenges for AI developers regarding licensing. It might also lead to a fragmentation of the open-source community into factions with differing views on how AI should be governed under open-source rules." But "Ultimately, it could spur the creation of alternative definitions or movements aimed at maintaining stricter adherence to the traditional tenets of software freedom in the AI age."

However the official FAQ for the new Open Source AI definition argues that training data "does not equate to a software source code." Training data is important to study modern machine learning systems. But it is not what AI researchers and practitioners necessarily use as part of the preferred form for making modifications to a trained model.... [F]orks could include removing non-public or non-open data from the training dataset, in order to train a new Open Source AI system on fully public or open data...

[W]e want Open Source AI to exist also in fields where data cannot be legally shared, for example medical AI. Laws that permit training on data often limit the resharing of that same data to protect copyright or other interests. Privacy rules also give a person the rightful ability to control their most sensitive information — like decisions about their health. Similarly, much of the world's Indigenous knowledge is protected through mechanisms that are not compatible with later-developed frameworks for rights exclusivity and sharing.

Read on for the rest of their response...
Movies

ASWF: the Open Source Foundation Run By the Folks Who Give Out Oscars (theregister.com) 18

This week's Ubuntu Summit 2024 was attended by Lproven (Slashdot reader #6,030). He's also a FOSS correspondent for the Register, where he's filed this report: One of the first full-length sessions was presented by David Morin, executive director of the Academy Software Foundation, introducing his organization in a talk about Open Source Software for Motion Pictures. Morin linked to the Visual Effects Society's VFX/Animation Studio Workstation Linux Report, highlighting the market share pie-chart, showing Rocky Linux 9 with at some 58 percent and the RHELatives in general at 90 percent of the market. Ubuntu 22 and 24 — the report's nomenclature, not this vulture's — got just 10.5 percent. We certainly didn't expect to see that at an Ubuntu event, with the latest two versions of Rocky Linux taking 80 percent of the studio workstation market...

What also struck us over the next three quarters of an hour is that Linux and open source in general seem to be huge components of the movie special effects industry — to an extent that we had not previously realized.

There's a "sizzle reel" showing examples of how major motion pictures used OpenColorIO, an open-source production tool for syncing color representations originally developed by Sony Pictures Imageworks. That tool is hosted by a collaboration between the Linux Foundation with the Science and Technology Council of the Academy of Motion Picture Arts and Sciences (the "Academy" of the Academy Awards). The collaboration — which goes by the name of the Academy Software Foundation — hosts 14 different projects The ASWF hasn't been around all that long — it was only founded in 2018. Despite the impact of the COVID pandemic, by 2022 it had achieved enough to fill a 45-page history called Open Source in Entertainment [PDF]. Morin told the crowd that it runs events, provides project marketing and infrastructure, as well as funding, training and education, and legal assistance. It tries to facilitate industry standards and does open source evangelism in the industry. An impressive list of members — with 17 Premier companies, 16 General ones, and another half a dozen Associate members — shows where some of the money comes from. It's a big list of big names. [Adobe, AMD, AWS, Autodesk...]
The presentation started with OpenVBD, a C++ library developed and donated by Dreamworks for working with three-dimensional voxel-based shapes. (In 2020 they created this sizzle reel, but this year they've unveiled a theme song.) Also featured was OpenEXR, originally developed at Industrial Light and Magic and sourced in 1999. (The article calls it "a specification and reference implementation of the EXR file format — a losslessly compressed image storage format for moving images at the highest possible dynamic range.")

"For an organization that is not one of the better-known ones in the FOSS space, we came away with the impression that the ASWF is busy," the article concludes. (Besides running Open Source Days and ASWF Dev Days, it also hosts several working groups like the Language Interop Project works on Rust bindings and the Continuous Integration Working Group on CI tools, There's generally very little of the old razzle-dazzle in the Linux world, but with the demise of SGI as the primary maker of graphics workstations — its brand now absorbed by Hewlett Packard Enterprise — the visual effects industry moved to Linux and it's doing amazing things with it. And Kubernetes wasn't even mentioned once.
AI

We Finally Have an 'Official' Definition For Open Source AI (techcrunch.com) 9

There's finally an "official" definition of open source AI. The Open Source Initiative (OSI), a long-running institution aiming to define and "steward" all things open source, today released version 1.0 of its Open Source AI Definition (OSAID). TechCrunch: The product of several years of collaboration with academia and industry, the OSAID is intended to offer a standard by which anyone can determine whether AI is open source -- or not. You might be wondering why consensus matters for a definition of open source AI. Well, a big motivation is getting policymakers and AI developers on the same page, said OSI EVP Stefano Maffulli.

"Regulators are already watching the space," Maffulli told TechCrunch, noting that bodies like the European Commission have sought to give special recognition to open source. "We did explicit outreach to a diverse set of stakeholders and communities -- not only the usual suspects in tech. We even tried to reach out to the organizations that most often talk to regulators in order to get their early feedback." [...] To be considered open source under the OSAID, an AI model has to provide enough information about its design so that a person could "substantially" recreate it. The model must also disclose any pertinent details about its training data, including the provenance, how the data was processed, and how it can be obtained or licensed.

Open Source

Password Manager Bitwarden Makes Changes to Address Concerns Over Open Source Licensing (github.com) 10

Bitwarden describes itself as an "open source password manager for business." But it also made a change to its build requirement which led to an issue on the project's GitHub page titled "Desktop version 2024.10.0 is no longer free software."

In the week that followed Bitwarden's official account on X.com promised a fix was coming. "It seems a packaging bug was misunderstood as something more, and the team plans to resolve it. Bitwarden remains committed to the open source licensing model in place for years, along with retaining a fully featured free version for individual users." And Thursday Bitwarden followed through with new changes to address the concerns.

The Register reports the whole episode started because of a new build requirement added in a pull request a couple of weeks ago titled "Introduce SDK client." This SDK is required to compile the software from source — either the Bitwarden server or any of its client applications... [But the changed license had warned "You may not use this SDK to develop applications for use with software other than Bitwarden (including non-compatible implementations of Bitwarden) or to develop another SDK."]
Phoronix picks up the story: The issue of this effectively not making the Bitwarden client free software was raised in this GitHub issue... Bitwarden founder and CTO Kyle Spearrin has commented on the ticket... "Being able to build the app as you are trying to do here is an issue we plan to resolve and is merely a bug." The ticket was subsequently locked and limited to collaborators.
And Thursday it was Bitwarden founder and CTO Kyle Spearrin who again re-appeared in the Issue — first thanking the user who had highlighted the concerns. "We have made some adjustments to how the SDK code is organized and packaged to allow you to build and run the app with only GPL/OSI licenses included." The sdk-internal package references in the clients now come from a new sdk-internal repository, which follows the licensing model we have historically used for all of our clients (see LICENSE_FAQ.md for more info). The sdk-internal reference only uses GPL licenses at this time. If the reference were to include Bitwarden License code in the future, we will provide a way to produce multiple build variants of the client, similar to what we do with web vault client builds.

The original sdk repository will be renamed to sdk-secrets, and retains its existing Bitwarden SDK License structure for our Secrets Manager business products. The sdk-secrets repository and packages will no longer be referenced from the client apps, since that code is not used there.

Businesses

San Francisco Billboards Call Out Tech Firms For Not Paying For Open Source (theregister.com) 67

An anonymous reader shares a report: Drivers passing through San Francisco have a new roadside distraction to consider: billboards calling out businesses that don't cough up for the open source code that they use. The signs are the work of the Open Source Pledge -- a group that launched earlier this month. It asks businesses that make use of open source code to pledge $2,000 per developer to support projects that develop the code. So far, 25 companies have signed up -- but project co-founder Chad Whitacre wants bigger firms to pay their dues, too.

Whitacre, whose day job is head of open source at app-monitoring biz Sentry, told The Register his employer has for three years operated a scheme to pay developers who maintain and upgrade open source code. "We do dollars per developer, the thinking being it's the developers and software engineers on the staff at a company who benefit the most from open source, who become more productive because of open source," he said. "I had one conversation with a representative from a larger firm and he's like: 'Chad, you're asking me to spend ten million on maintainers.'" Whitacre affirmed that request, and pointed out the firm "spends ten million on something anyway."

Open Source

Google Offers Its AI Watermarking Tech As Free Open Source Toolkit (arstechnica.com) 13

An anonymous reader quotes a report from Ars Technica: Back in May, Google augmented its Gemini AI model with SynthID, a toolkit that embeds AI-generated content with watermarks it says are "imperceptible to humans" but can be easily and reliably detected via an algorithm. Today, Google took that SynthID system open source, offering the same basic watermarking toolkit for free to developers and businesses. The move gives the entire AI industry an easy, seemingly robust way to silently mark content as artificially generated, which could be useful for detecting deepfakes and other damaging AI content before it goes out in the wild. But there are still some important limitations that may prevent AI watermarking from becoming a de facto standard across the AI industry any time soon.

Google uses a version of SynthID to watermark audio, video, and images generated by its multimodal AI systems, with differing techniques that are explained briefly in this video. But in a new paper published in Nature, Google researchers go into detail on how the SynthID process embeds an unseen watermark in the text-based output of its Gemini model. The core of the text watermarking process is a sampling algorithm inserted into an LLM's usual token-generation loop (the loop picks the next word in a sequence based on the model's complex set of weighted links to the words that came before it). Using a random seed generated from a key provided by Google, that sampling algorithm increases the correlational likelihood that certain tokens will be chosen in the generative process. A scoring function can then measure that average correlation across any text to determine the likelihood that the text was generated by the watermarked LLM (a threshold value can be used to give a binary yes/no answer).

GNU is Not Unix

'100% Free' GNU Boot Discovers They've Been Shipping Non-Free Code - Again (phoronix.com) 36

Libreboot is a distribution of coreboot "aimed at replacing the proprietary BIOS firmware contained by most computers."

So then what exactly is GNU Boot? Its home page explains... In November 2022, Libreboot began to include non-libre code. We have made repeated efforts to continue collaboration with those developers to help their version of Libreboot remain libre, but that was not successful. Now we've stepped forward to stand up for freedom, ours and that of the wider community, by maintaining our own version — a genuinely libre Libreboot, that after some hurdles gave birth to this project: GNU Boot.
But today, Phoronix writes: While priding itself on being "100% free", last December [GNU Boot] had to drop some motherboard support and CPU code after discovering they were shipping some files that are non-free by their free software standards. Today they announced another mistake in having inadvertently been shipping additional non-free code.

GNU Boot discovered an issue with non-free code affecting not only them but also some of the Linux distributions that pride themselves on being fully free software / 100% open-source. This latest snafu they say is "more problematic" than their prior non-free code discover due to impacting the free software Linux distributions too. The issue at hand though comes down to test data contained within the archive and that containing non-free code in the form of microcode, BIOS bits, and Intel Management Engine firmware.

"We also contacted Replicant..." according to the announcement, "a free Android distro that also ships vboot source code." And in addition, "We had to re-release all the affected tarballs." (Which at this point is three release candidates...)

Slashdot Top Deals