Open Source Data Sets? Linux Foundation Introduces 'Community Data License Agreements' (linuxinsider.com) 31
"In open source philosophy, you share source code. Why not share data?" writes Slashdot reader princelobga. Linux Insider reports on the Linux Foundation's new Community Data License Agreement, "a new framework for sharing large sets of data required for research, collaborative learning and other purposes."
CDLAs will allow both individuals and groups to share data sets in the same way they share open source software code, the foundation said. "As systems require data to learn and evolve, no one organization can build, maintain and source all data required," noted Mike Dolan, VP of strategic programs at The Linux Foundation. "Data communities are forming around artificial intelligence and machine learning use cases, autonomous systems, and connected civil infrastructure," he told LinuxInsider. "The CDLA license agreements enable sharing data openly, embodying best practices learned over decades of sharing source code."
A principal analyst at Pund-IT told the site that the new data license "reflects the growing importance of information as a resource for big data analytics, machine learning and artificial intelligence."
Re: (Score:2)
Obviously you've _never_ worked on any games. Art, Music, Levels, Previous Executables, etc. are all valid things to keep in a repo.
Stop using shitty source repositories that don't know how to handle binary blobs.
Scientists should have been doing this all along so they can get independent confirmation.
Re: (Score:2)
Don't put binary data to the source repository.
Utter bullshit. It's perfectly fine to put binary data in a source repository. Binary data, code tests, unit tests, images, codecs, music/sounds, validation playbooks: all that stuff BELONGS in a source code repository.
Frankly, if you're doing development at any non-trivial level, you would be an idiot not to store all that stuff in a repository.
"Our building burned down and we lost all of the art and music and 3-D models we painstakingly made, but we still have the (now useless) source code!"
Open Data? (Score:3)
How/Why is this different from the Open Data [opendatacommons.org] license?
Re: (Score:2)
It's not strategic. Probably not webscale either.
Yeah no copyright on data in the US (Score:2)
That's absolutely right: in the US, at least, there is no copyright on a collection of facts. I don't know if any other countries might allow it on a specific compilation of data. Obviously copyright on a single, discrete fact wouldn't make any sense.
In the US, a copyright could apply to a creative arrangement and formatting of facts. (Much as there is no copyright on musical notes, but there can be on a specific, creative arrangement of specific notes, a song).
So under copyright, you can take someone's da
Re:Yeah no copyright on data in the US (Score:4, Informative)
At least in the EU there is a special addition to copyright, which is specifically aimed at databases. They are considered creative works once effort has been put into compiling them. So for at least the EU, this license makes a great deal of sense.
The most pressing problem is of course, that most open data has little documentation and is hardly any use. But hey, at least we got the legal issues fixed once a license can be slapped onto it. Who needs to actually work with it, anyway?
Re: (Score:2)
From their FAQ:
This is a really strange statement to me. They apparently didn't attempt to address any problem with any existing license, but went ahead and rolled their own license without any consideration for the development
Data licenses can be tricky (Score:3)
I see several comments already implying that open licenses specifically for data are unnecessary because we already have free open source licenses for code, but they're not the same.
Most of our open source licenses use copyright law as their foundation. Different legal systems treat the idea of copyrighting facts somewhat differently, but in the US, facts aren't copyrightable. That means that trying to apply a FOSS license to data can be fraught: how can a copyright-based license apply if data aren't copyrightable?
In the EU, facts aren't copyrightable either, but "databases" are protected, where a "database" is a collection of data that has had value added by, for example, efforts to organize the data (they call this protection a "sui generis" database right).
How do you deal with jurisdictional incompatibilities? Good open data licenses spell out the solutions to these conflicts. I see nothing about them in the CDLA. That alone would make me extremely hesitant to try to use it on any data product I publish.
Ask Equifax (Score:2)
They already provide any information you want freely. ;)