There is no rollback option (rollback has a different meaning in a MongoDB context), and strictly speaking there is no supported way to get these documents back – the precautions you can/should take are covered in the comments. With that said however, if you are running a replica set, even a single node replica set, then you have an oplog
. With an oplog
that covers when the documents were inserted, you may be able to recover them.
The easiest way to illustrate this is with an example. I will use a simplified example with just 100 deleted documents that need to be restored. To go beyond this (huge number of documents, or perhaps you wish to only selectively restore etc.) you will either want to change the code to iterate over a cursor or write this using your language of choice outside the MongoDB shell. The basic logic remains the same.
First, let’s create our example collection foo
in the database dropTest
. We will insert 100 documents without a name
field and 100 documents with an identical name
field so that they can be mistakenly removed later:
use dropTest;
for(i=0; i < 100; i++){db.foo.insert({_id : i})};
for(i=100; i < 200; i++){db.foo.insert({_id : i, name : "some_x_name"})};
Now, let’s simulate the accidental removal of our 100 name
documents:
> db.foo.remove({ "name" : "some_x_name"})
WriteResult({ "nRemoved" : 100 })
Because we are running in a replica set, we still have a record of these documents in the oplog
(being inserted) and thankfully those inserts have not (yet) fallen off the end of the oplog
(the oplog
is a capped collection remember) . Let’s see if we can find them:
use local;
db.oplog.rs.find({op : "i", ns : "dropTest.foo", "o.name" : "some_x_name"}).count();
100
The count looks correct, we seem to have our documents still. I know from experience that the only piece of the oplog
entry we will need here is the o
field, so let’s add a projection to only return that (output snipped for brevity, but you get the idea):
db.oplog.rs.find({op : "i", ns : "dropTest.foo", "o.name" : "some_x_name"}, {"o" : 1});
{ "o" : { "_id" : 100, "name" : "some_x_name" } }
{ "o" : { "_id" : 101, "name" : "some_x_name" } }
{ "o" : { "_id" : 102, "name" : "some_x_name" } }
{ "o" : { "_id" : 103, "name" : "some_x_name" } }
{ "o" : { "_id" : 104, "name" : "some_x_name" } }
To re-insert those documents, we can just store them in an array, then iterate over the array and insert the relevant pieces. First, let’s create our array:
var deletedDocs = db.oplog.rs.find({op : "i", ns : "dropTest.foo", "o.name" : "some_x_name"}, {"o" : 1}).toArray();
> deletedDocs.length
100
Next we remind ourselves that we only have 100 docs in the collection now, then loop over the 100 inserts, and finally revalidate our counts:
use dropTest;
db.foo.count();
100
// simple for loop to re-insert the relevant elements
for (var i = 0; i < deletedDocs.length; i++) {
db.foo.insert({_id : deletedDocs[i].o._id, name : deletedDocs[i].o.name});
}
// check total and name counts again
db.foo.count();
200
db.foo.count({name : "some_x_name"})
100
And there you have it, with some caveats:
- This is not meant to be a true restoration strategy, look at backups (MMS, other), delayed secondaries for that, as mentioned in the comments
- It’s not going to be particularly quick to query the documents out of the oplog (any oplog query is a table scan) on a large busy system.
- The documents may age out of the oplog at any time (you can, of course, make a copy of the oplog for later use to give you more time)
- Depending on your workload you might have to de-dupe the results before re-inserting them
- Larger sets of documents will be too large for an array as demonstrated, so you will need to iterate over a cursor instead
- The format of the
oplog
is considered internal and may change at any time (without notice), so use at your own risk