Replit AI went rogue, deleted a company's entire database, then hid it and lied about it
(programming.dev)
I think you're right. The Venn diagram of people who run robust backup systems and people who run LLM AIs on their production data is two circles that don't touch.
Working on a software project. Can you describe a robust backup system? I have my notes and code and other files backed up.
Sure, but it's a bit of an open-ended question because it depends on your requirements (and your clients' potentially), and your risk comfort level. Sorry in advance, huge reply.
When you're backing up a production environment it's different from just backing up personal data, so you have to consider stateful backups of the data across the whole environment. You want to ensure, for instance, that an app's config is aware of changes made recently on the database, else you may be restoring inconsistent data that will have issues/errors. For a small project that runs on a single server you can do a nightly backup that runs a pre-backup script to gracefully stop all of your key services, then performs the backup, then starts them again with a post-backup script. Large environments with multiple servers (or containers/etc) or sites get much more complex.
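As a rough illustration of the stop / back up / start sequence on a single server, here is a minimal Python sketch. It assumes the services are managed by systemd (hence `systemctl`), and the service names and directories you pass in are placeholders for whatever your project actually uses. The `run` parameter is only there so the commands can be swapped out for a dry run.

```python
import subprocess
import tarfile
from datetime import datetime
from pathlib import Path


def run_backup(data_dirs, dest_dir, services, run=subprocess.run):
    """Stop services, archive data_dirs into dest_dir, then restart services.

    Assumes systemd-managed services; names and paths are supplied by the caller.
    """
    stopped = []
    try:
        # Pre-backup step: stop each service so files are not changing mid-copy.
        for name in services:
            run(["systemctl", "stop", name], check=True)
            stopped.append(name)

        # Timestamped archive name, so older backups are kept, not overwritten.
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        archive = Path(dest_dir) / f"backup-{stamp}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for directory in data_dirs:
                tar.add(directory, arcname=Path(directory).name)
        return archive
    finally:
        # Post-backup step: restart whatever was stopped, in reverse order,
        # even if the archive step failed.
        for name in reversed(stopped):
            run(["systemctl", "start", name], check=True)
```

Something like `run_backup(["/srv/myapp/data"], "/mnt/nas/backups", ["myapp", "postgresql"])` from a nightly cron job or timer would be the idea; those paths and service names are made up for the example.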
Keeping with the single server example: those backups can be stored on a local NAS, synced to another location on a schedule (not set to overwrite but to keep multiple copies), and ideally you would take a periodic (eg weekly, whatever you're comfortable with) copy off to a non-networked device like a USB drive or tape, which would also be offsite (eg carried home, or stored in a drawer in the case of a home office). This is loosely the 3-2-1 strategy, which is to have at least 3 copies of important data on 2 different media ('devices' is often used today) with 1 offsite. It keeps you protected from a local physical disaster (eg fire/burglary) and a network disaster (eg virus/crypto/accidental deletion), and it has enough redundancy that more than one thing has to go wrong to cause you serious data loss.
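If it helps to make the 3-2-1 check concrete, here is a small Python sketch that tests an inventory of copies against the rule. The `BackupCopy` type and its fields are made up for this example, and it counts the live production data as one of the copies, which is one common reading of the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    # Hypothetical inventory record, not from any real tool.
    label: str      # e.g. "production disk", "office NAS"
    medium: str     # e.g. "ssd", "nas", "usb", "tape"
    offsite: bool   # stored away from the production site?


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 distinct media, with >= 1 offsite."""
    media = {copy.medium for copy in copies}
    has_offsite = any(copy.offsite for copy in copies)
    return len(copies) >= 3 and len(media) >= 2 and has_offsite


inventory = [
    BackupCopy("production disk", "ssd", offsite=False),
    BackupCopy("office NAS", "nas", offsite=False),
    BackupCopy("weekly USB drive", "usb", offsite=True),
]
print(satisfies_3_2_1(inventory))
```

Dropping the USB drive from that list makes the check fail on two counts: only two copies are left, and none of them is offsite.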
Really the best advice I can give is to make a disaster recovery plan (DRP). There are guides online, but essentially you plot out the sequence of steps it would take you to restore your environment to up-and-running with current data, in case of a disaster that takes out your production environment or its data.
How long would it take you to spin up new servers (or docker containers or whatever) and configure them to the right IPs, DNS, auth keys and so on? How long to get the most recent copy of your production data back on that newly-built system and running? Those are the types of questions you try to answer with a DRP.
Once you have an idea of what a recovery would look like and how long it would take, it will inform how you may want to approach your backup. You might decide that file-based backups of your core config data and database files or other unique data are not enough for you (because the restore process may have you out of business for a week), and you'd rather do a machine-wide stateful backup of the system that could get you back up and running much quicker (perhaps a day).
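To put rough numbers on that comparison, you can list the recovery steps with an estimated duration each and add them up. The Python sketch below does just that; every step name and hour figure in it is invented for illustration, so substitute estimates from your own DRP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStep:
    name: str
    hours: float  # your own estimate for this step


def total_recovery_hours(steps):
    """Sum the estimated hours, assuming steps run one after another."""
    return sum(step.hours for step in steps)


def meets_target(steps, max_downtime_hours):
    """True if the estimated recovery fits within the downtime you can tolerate."""
    return total_recovery_hours(steps) <= max_downtime_hours


# Hypothetical plans: rebuild from file-based backups vs restore a full image.
file_based = [
    RecoveryStep("provision new server", 4),
    RecoveryStep("reinstall and configure services", 16),
    RecoveryStep("restore config and database files", 6),
    RecoveryStep("fix up DNS and auth keys", 2),
]
image_based = [
    RecoveryStep("provision new server", 4),
    RecoveryStep("restore machine image", 3),
    RecoveryStep("fix up DNS", 1),
]

for label, plan in [("file-based", file_based), ("image-based", image_based)]:
    print(label, total_recovery_hours(plan), meets_target(plan, max_downtime_hours=24))
```

The point is less the arithmetic than writing the steps down: the plan that blows past your tolerable downtime tells you which backup approach to reconsider.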
Whatever you choose, the most important step (that is often overlooked) is to actually do a test recovery once you have a backup plan implemented and a DR plan considered. Take your live environment offline and attempt your recovery plan. It's really not so hard for small environments, and it can make you find all sorts of things you missed in the planning stage that need reconsideration. It's much less stressful when you find those problems while you know your real environment is just sitting there waiting to be turned back on. But like I said, it's all down to how comfortable you are with risk, and really how much of your time you want to spend considering backups and DR.
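One simple check you can run during a test recovery is to compare the restored files against the originals by checksum. This is a minimal Python sketch of that idea, assuming both trees are reachable as local directories. It only covers file contents, not permissions, ownership, or whether the services actually start.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            # Reads whole files into memory; fine for a sketch, not for huge files.
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_restore(original_dir, restored_dir):
    """Report files that are missing, unexpected, or different after a restore."""
    original = tree_digests(original_dir)
    restored = tree_digests(restored_dir)
    return {
        "missing": sorted(set(original) - set(restored)),
        "extra": sorted(set(restored) - set(original)),
        "changed": sorted(
            name for name in original.keys() & restored.keys()
            if original[name] != restored[name]
        ),
    }
```

An empty result in all three lists means the restored tree matches byte for byte; anything else is worth chasing down before you rely on the backup.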
Look up the 3-2-1 rule for guidance on an “industry standard” level of protection.