Linux News | Slashdot

Archivists Work To Identify and Save the Thousands of Datasets Disappearing From Data.gov (404media.co) 70

Posted by BeauHD on Thursday January 30, 2025 @07:45PM from the lost-and-found dept.

An anonymous reader quotes a report from 404 Media: Datasets aggregated on data.gov, the largest repository of U.S. government open data on the internet, are being deleted, according to the website's own information. Since Donald Trump was inaugurated as president, more than 2,000 datasets have disappeared from the database. As people in the Data Hoarding and archiving communities have pointed out, on January 21, there were 307,854 datasets on data.gov. As of Thursday, there are 305,564 datasets. Many of the deletions happened immediately after Trump was inaugurated, according to snapshots of the website saved on the Internet Archive's Wayback Machine. Harvard University researcher Jack Cushman has been taking snapshots of Data.gov's datasets both before and after the inauguration, and has worked to create a full archive of the data.

"Some of [the entries link to] actual data," Cushman told 404 Media. "And some of them link to a landing page [where the data is hosted]. And the question is -- when things are disappearing, is it the data it points to that is gone? Or is it just the index to it that's gone?" For example, "National Coral Reef Monitoring Program: Water Temperature Data from Subsurface Temperature Recorders (STRs) deployed at coral reef sites in the Hawaiian Archipelago from 2005 to 2019," a NOAA dataset, can no longer be found on data.gov but can be found on one of NOAA's websites by Googling the title. "Stetson Flower Garden Banks Benthic_Covage Monitoring 1993-2018 -- OBIS Event," another NOAA dataset, can no longer be found on data.gov and also appears to have been deleted from the internet. "Three Dimensional Thermal Model of Newberry Volcano, Oregon," a Department of Energy resource, is no longer available via the Department of Energy but can be found backed up on third-party websites. [...]

Data.gov serves as an aggregator of datasets and research across the entire government, meaning it isn't a single database. This makes it slightly harder to archive than any individual database, according to Mark Phillips, a University of North Texas researcher who works on the End of Term Web Archive, a project that archives as much as possible from government websites before a new administration takes over. "Some of this falls into the 'We don't know what we don't know,'" Phillips told 404 Media. "It is very challenging to know exactly what, where, how often it changes, and what is new, gone, or going to move. Saving content from an aggregator like data.gov is a bit more challenging for the End of Term work because often the data is only identified and registered as a metadata record with data.gov but the actual data could live on another website, a state .gov, a university website, cloud provider like Amazon or Microsoft or any other location. This makes the crawling even more difficult."

Phillips said that, for this round of archiving (which the team does every administration change), the project has been crawling government websites since January 2024, and that they have been doing "large-scale crawls with help from our partners at the Internet Archive, Common Crawl, and the University of North Texas. We've worked to collect 100s of terabytes of web content, which includes datasets from domains like data.gov." [...] It is absolutely true that the Trump administration is deleting government data and research and is making it harder to access. But determining what is gone, where it went, whether it's been preserved somewhere, and why it was taken down is a process that is time intensive and going to take a while. "One thing that is clear to me about datasets coming down from data.gov is that when we rely on one place for collecting, hosting, and making available these datasets, we will always have an issue with data disappearing," Phillips said. "Historically the federal government would distribute information to libraries across the country to provide greater access and also a safeguard against loss. That isn't done in the same way for this government data."
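The "what is gone" question raised above can be approached by diffing two catalog snapshots. The following is a minimal sketch, assuming each snapshot has already been reduced to a mapping of dataset identifier to title; the identifiers and titles are hypothetical placeholders, and the functions `diff_snapshots` and `net_change` are illustrative names, not part of any archiving tool mentioned in the report.

```python
from typing import Dict, Set, Tuple


def diff_snapshots(
    before: Dict[str, str], after: Dict[str, str]
) -> Tuple[Set[str], Set[str]]:
    """Return (removed_ids, added_ids) between two catalog snapshots.

    Each snapshot maps a dataset identifier to its title.
    """
    removed = set(before) - set(after)
    added = set(after) - set(before)
    return removed, added


def net_change(count_before: int, count_after: int) -> int:
    """Net change in the number of listed datasets (negative means fewer)."""
    return count_after - count_before


# Hypothetical snapshots; real ones would hold hundreds of thousands of entries.
before = {
    "ds-001": "Coral reef water temperature",
    "ds-002": "Benthic cover monitoring",
    "ds-003": "Volcano thermal model",
}
after = {
    "ds-001": "Coral reef water temperature",
    "ds-004": "Newly registered dataset",
}

removed, added = diff_snapshots(before, after)
print(sorted(removed))  # ['ds-002', 'ds-003']
print(sorted(added))    # ['ds-004']

# Counts quoted in the report: January 21 versus the following Thursday.
print(net_change(307_854, 305_564))  # -2290
```

A net count only shows the balance of removals and additions, so an identifier-level diff is needed to list which records actually left the index. It still cannot say whether the underlying data survives elsewhere, which is the distinction Cushman draws.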

Microsoft Makes DeepSeek's R1 Model Available On Azure AI and GitHub 30

Posted by BeauHD on Wednesday January 29, 2025 @07:20PM from the that-didn't-take-long dept.

Comcast Is Rolling Out 'Ultra-Low Lag' Tech That Could Fix the Internet (theverge.com) 80

Posted by msmash on Wednesday January 29, 2025 @02:10PM from the moving-forward dept.

Cloud Services Market Is 'Not Working,' Says UK Regulator (www.gov.uk) 39

Posted by msmash on Tuesday January 28, 2025 @11:20AM from the how-about-that dept.

Should Big Tech Plug Its Data Centers Directly Into Power Plants? (apnews.com) 86

Posted by EditorDavid on Monday January 27, 2025 @01:56AM from the off-the-grid dept.

"Looking for a quick fix for their fast-growing electricity diets, tech giants are increasingly looking to strike deals with power plant owners to plug in directly," reports the Associated Press, "avoiding a potentially longer and more expensive process of hooking into a fraying electric grid that serves everyone else." (It can take up to four years to connect a data center to the grid, one data center trade group says in the article — years longer than it takes to build a new data center.)

But the idea of bypassing the grid is "raising questions over whether diverting power to higher-paying customers will leave enough for others and whether it's fair to excuse big power users from paying for the grid." Front and center is the data center that Amazon's cloud computing subsidiary, Amazon Web Services, is building next to the Susquehanna nuclear plant in eastern Pennsylvania. The arrangement between the plant's owners and AWS — called a "behind the meter" connection — is the first such to come before the Federal Energy Regulatory Commission. For now, FERC has rejected a deal that could eventually send 960 megawatts — about 40% of the plant's capacity — to the data center. That's enough to power more than a half-million homes... [But the FERC's 2-1 rejection "was procedural. Recent comments by commissioners suggest they weren't ready to decide how to regulate such a novel matter without more study."]

In theory, the AWS deal would let Susquehanna sell power for more than they get by selling into the grid... The profit potential is one that other nuclear plant operators, in particular, are embracing after years of financial distress and frustration with how they are paid in the broader electricity markets. Many say they have been forced to compete in some markets against a flood of cheap natural gas as well as state-subsidized solar and wind energy. Power plant owners also say the arrangement benefits the wider public, by bypassing the costly buildout of long power lines and leaving more transmission capacity on the grid for everyone else...

Monitoring Analytics, the market watchdog in the mid-Atlantic grid, wrote in a filing to FERC that the impact would be "extreme" if the Susquehanna-AWS model were extended to all nuclear power plants in the territory. Energy prices would increase significantly and there's no explanation for how rising demand for power will be met even before big power plants drop out of the supply mix, it said.

Bambu Labs' 3D Printer 'Authorization' Update Beta Sparks Concerns (theverge.com) 47

Posted by EditorDavid on Saturday January 25, 2025 @12:34PM from the printer-net-service-providers dept.

Slashdot reader jenningsthecat writes: 3D printer manufacturer Bambu Labs has faced a storm of controversy and protest after releasing a security update which many users claim is the first step in moving towards an HP-style subscription model.
Bambu Labs responded that there's misinformation circulating online, adding "we acknowledge that our communication might have contributed to the confusion." Bambu Labs spokesperson Nadia Yaakoubi did "damage control," answering questions from the Verge:

Q: Will Bambu publicly commit to never requiring a subscription in order to control its printers and print from them over a home network?

A: For our current product line, yes. We will never require a subscription to control or print from our printers over a home network...

Q: Will Bambu publicly commit to never putting any existing printer functionality behind a subscription?

A: Yes...
Bambu's site adds that the security update "is beta testing, not a forced update. The choice is yours. You can participate in the beta program to help us refine these features, or continue using your current firmware."

Hackaday notes another wrinkle: This follows the original announcement which had the 3D printer community up in arms, and quickly saw the new tool that's supposed to provide safe and secure communications with Bambu Lab printers ripped apart to extract the security certificate and private key... As the flaming wreck that's Bambu Lab's PR efforts keeps hurtling down the highway of public opinion, we'd be remiss to not point out that with the security certificate and private key being easily obtainable from the Bambu Connect Electron app, there is absolutely no point to any of what Bambu Lab is doing.
The Verge asked Bambu Labs about that too: Q: Does the private key leaking change any of your plans?

A: No, this doesn't change our plans, and we've taken immediate action.
Bambu Labs had said their security update would "ensure only authorized access and operations are permitted," remembers Ars Technica. "This would, Bambu suggested, mitigate risks of 'remote hacks or printer exposure issues' and lower the risk of 'abnormal traffic or attacks.'" This was necessary, Bambu wrote, because of increases in requests made to its cloud services "through unofficial channels," targeted DDOS attacks, and "peaks of up to 30 million unauthorized requests per day" (link added by Bambu).
But Ars Technica also found some skepticism online: Repair advocate Louis Rossmann, noting Bambu's altered original blog post, uploaded a video soon after, "Bambu's Gaslighting Masterclass: Denying their own documented restrictions"... suggesting that the company was asking buyers to trust that Bambu wouldn't enact restrictive policies it otherwise wrote into its user agreements.
And Ars Technica also cites another skeptical response from a video posted by open source hardware hacker and YouTube creator Jeff Geerling: "Every IoT device has these problems, and there are better ways to secure things than by locking out access, or making it harder to access, or requiring their cloud to be integrated."

Netflix's Cloud Plans Include Co-Op and Party Games (theverge.com) 9

Posted by BeauHD on Friday January 24, 2025 @05:30PM from the what-to-expect dept.

FBI: North Korean IT Workers Steal Source Code To Extort Employers (bleepingcomputer.com) 27

Posted by msmash on Friday January 24, 2025 @02:50PM from the PSA dept.

OpenAI's Stargate Deal Heralds Shift Away From Microsoft 38

Posted by msmash on Thursday January 23, 2025 @10:00AM from the growing-apart dept.

Google Reportedly Worked Directly With Israel's Military On AI Tools 66

Posted by BeauHD on Wednesday January 22, 2025 @06:40PM from the behind-the-scenes dept.

Microsoft Loses Status as OpenAI's Exclusive Cloud Provider 8

Posted by msmash on Wednesday January 22, 2025 @02:00AM from the things-change dept.

EA's Origin App For PC Gaming Will Shut Down In April 17

Posted by BeauHD on Tuesday January 21, 2025 @06:20PM from the end-of-the-line dept.

Microsoft-OpenAI Partnership Raises Antitrust Concerns, FTC Says (bloomberg.com) 2

Posted by msmash on Friday January 17, 2025 @05:20PM from the rising-concerns dept.

Ransomware Crew Abuses AWS Native Encryption, Sets Data-Destruct Timer for 7 Days (theregister.com) 18

Posted by BeauHD on Tuesday January 14, 2025 @06:00AM from the systemic-risks dept.

A new ransomware group called Codefinger targets AWS S3 buckets by exploiting compromised or publicly exposed AWS keys to encrypt victims' data using AWS's own SSE-C encryption, rendering it inaccessible without the attacker-generated AES-256 keys. While other security researchers have documented techniques for encrypting S3 buckets, "this is the first instance we know of leveraging AWS's native secure encryption infrastructure via SSE-C in the wild," Tim West, VP of services with the Halcyon RISE Team, told The Register. "Historically AWS Identity IAM keys are leaked and used for data theft but if this approach gains widespread adoption, it could represent a significant systemic risk to organizations relying on AWS S3 for the storage of critical data," he warned. From the report: ... in addition to encrypting the data, Codefinger marks the compromised files for deletion within seven days using the S3 Object Lifecycle Management API -- the criminals themselves do not threaten to leak or sell the data, we're told. "This is unique in that most ransomware operators and affiliate attackers do not engage in straight up data destruction as part of a double extortion scheme or to otherwise put pressure on the victim to pay the ransom demand," West said. "Data destruction represents an additional risk to targeted organizations."

Codefinger also leaves a ransom note in each affected directory that includes the attacker's Bitcoin address and a client ID associated with the encrypted data. "The note warns that changes to account permissions or files will end negotiations," the Halcyon researchers said in a report about S3 bucket attacks shared with The Register. While West declined to name or provide any additional details about the two Codefinger victims -- including if they paid the ransom demands -- he suggests that AWS customers restrict the use of SSE-C.

"This can be achieved by leveraging the Condition element in IAM policies to prevent unauthorized applications of SSE-C on S3 buckets, ensuring that only approved data and users can utilize this feature," he explained. Plus, it's important to monitor and regularly audit AWS keys, as these make very attractive targets for all types of criminals looking to break into companies' cloud environments and steal data. "Permissions should be reviewed frequently to confirm they align with the principle of least privilege, while unused keys should be disabled, and active ones rotated regularly to minimize exposure," West said. An AWS spokesperson said it notifies affected customers of exposed keys and "quickly takes any necessary actions, such as applying quarantine policies to minimize risks for customers without disrupting their IT environment."

They also directed users to this post about what to do upon noticing unauthorized activity.
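As an illustration of the Condition-based restriction West describes, here is a minimal sketch that builds a bucket policy denying uploads that carry an SSE-C header. It assumes the S3 condition key `s3:x-amz-server-side-encryption-customer-algorithm` and a `Null` test on it, which should be checked against current AWS documentation before use. The bucket name and the `deny_sse_c_policy` helper are hypothetical, and this sketch denies SSE-C for everyone rather than allowing the "approved data and users" exceptions he mentions.

```python
import json
from typing import Any, Dict

# Condition key that is present on a request only when SSE-C is used
# (assumption: verify the exact key name against AWS documentation).
SSE_C_ALGORITHM_KEY = "s3:x-amz-server-side-encryption-customer-algorithm"


def deny_sse_c_policy(bucket_name: str) -> Dict[str, Any]:
    """Build a bucket policy that denies PutObject requests using SSE-C.

    The "Null": "false" test matches requests where the SSE-C header exists.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenySSECUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {"Null": {SSE_C_ALGORITHM_KEY: "false"}},
            }
        ],
    }


# "example-bucket" is a placeholder name, not a bucket from the report.
policy = deny_sse_c_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

The resulting JSON could then be attached to a bucket through the console or an SDK. Carving out approved principals would need additional condition logic that is not shown here.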

Euro-Cloud Anexia Moves 12,000 VMs Off VMware to Homebrew KVM Platform (theregister.com) 57

Posted by BeauHD on Monday January 13, 2025 @05:20PM from the existential-migration dept.

Database Tables of Student, Teacher Info Stolen From PowerSchool In Cyberattack (theregister.com) 18

Posted by BeauHD on Friday January 10, 2025 @06:40PM from the class-act dept.

An anonymous reader quotes a report from The Register: A leading education software maker has admitted its IT environment was compromised in a cyberattack, with students and teachers' personal data -- including some Social Security Numbers and medical info -- stolen. PowerSchool says its cloud-based student information system is used by 18,000 customers around the globe, including the US and Canada, to handle grading, attendance records, and personal information of more than 60 million K-12 students and teachers. On December 28 someone managed to get into its systems and access their contents "using a compromised credential," the California-based biz told its clients in an email seen by The Register this week.

[...] "We believe the unauthorized actor extracted two tables within the student information system database," a spokesperson told us. "These tables primarily include contact information with data elements such as name and address information for families and educators. For a certain subset of the customers, these tables may also include Social Security Number, other personally identifiable information, and limited medical and grade information. Not all PowerSchool student information system customers were impacted, and we anticipate that only a subset of impacted customers will have notification obligations." While the company has tightened security measures and offered identity protection services to affected individuals, cybersecurity firm Cyble suggests the intrusion "may have been more serious and gone on much longer than has been publicly acknowledged so far," reports The Register. The cybersecurity vendor says the intrusion could have occurred as far back as June 16, 2011, with it ending on January 2 of this year.

"Critical systems and applications such as Oracle Netsuite ERP, HR software UltiPro, Zoom, Slack, Jira, GitLab, and sensitive credentials for platforms like Microsoft login, LogMeIn, Windows AD Azure, and BeyondTrust" may have been compromised, too.
