MongoDB: is indexing a pain?


Speaking in general, I want to know the best practices for querying (and therefore indexing) schemaless data structures, i.e. documents.

Let's say we use MongoDB to store and query deterministic data structures in a collection. At this point all documents have the same structure, so we can easily create indexes for any queries in the app, since we know that each document has the field(s) the index requires.

What happens after we change the structure and try to save new documents to the db? Let's say we joined two fields, firstname and lastname, into fullname. As a result, the collection contains nondeterministic data. I see two problems here:

  • The old indexes cannot cover the new data, so new indexes are needed that handle both the old and the new fields.
  • The app has to take care of dealing with two representations of the documents.

This may turn into a big problem when there are many changes to the db, resulting in many versions of the document structure.

I see two main approaches:

  • Lazy migration. Each document is migrated on demand (i.e. after it is loaded from the collection) to the final structure and stored back in the collection. This approach does not really solve the problems, because it concedes nondeterminism at any given point in time.
  • Forced migration. This is the same approach as RDBMS migrations. The migration is performed on all documents at one point in time while the app is not running. The main con is the downtime of the app.
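To make the lazy approach concrete, here is a minimal sketch in Python that works on plain dictionaries standing in for documents. The `upgrade` helper and the rule for joining the two names are assumptions made for illustration, not part of any MongoDB API; in a real app the upgraded document would then be written back to the collection.

```python
from typing import Any, Dict


def upgrade(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `doc` in the new structure (hypothetical helper)."""
    if "fullname" in doc:
        # Already in the new structure: nothing to migrate.
        return dict(doc)
    # Copy everything except the two old fields.
    upgraded = {k: v for k, v in doc.items() if k not in ("firstname", "lastname")}
    # Assumption: fullname is the old fields joined by a single space.
    parts = [doc.get("firstname", ""), doc.get("lastname", "")]
    upgraded["fullname"] = " ".join(p for p in parts if p)
    return upgraded


# A document loaded in the old structure is upgraded on demand.
old_doc = {"_id": 1, "firstname": "Ada", "lastname": "Lovelace"}
new_doc = upgrade(old_doc)
```

Until every document has passed through `upgrade`, queries still have to account for both shapes, which is the nondeterminism mentioned above.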

So the question is: is there any way of solving this problem without app downtime?

If you can't have downtime, then your only choice is to do the migrations "on the fly":

  1. Change the application so that when new documents are saved the new field is created, while it still reads the old ones.
  2. Update your collection with a script/queries to add the new field to all the existing documents.
  3. Create the new indexes on that field.
  4. Change the application so that it reads only the new fields.
  5. Drop the unnecessary indexes and remove the old fields from the documents.
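The application-side parts of these steps can be sketched roughly as follows, again in Python on plain dictionaries. All function names here are hypothetical; `$set` and `$unset` are MongoDB's update operators, and the update documents built below are the kind you would pass to a driver call such as PyMongo's `update_one` or `update_many`.

```python
from typing import Any, Dict


def join_names(doc: Dict[str, Any]) -> str:
    """Assumption: fullname is firstname and lastname joined by a space."""
    parts = [doc.get("firstname", ""), doc.get("lastname", "")]
    return " ".join(p for p in parts if p)


def prepare_for_save(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Step 1: write the new field alongside the old ones."""
    out = dict(doc)
    out["fullname"] = join_names(doc)
    return out


def display_name_step1(doc: Dict[str, Any]) -> str:
    """Step 1: the app still reads the old fields."""
    return join_names(doc)


def backfill_update(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Step 2: update document that adds fullname to an existing document."""
    return {"$set": {"fullname": join_names(doc)}}


def display_name_step4(doc: Dict[str, Any]) -> str:
    """Step 4: the app reads only the new field."""
    return doc["fullname"]


# Step 5: update document that removes the old fields once nothing reads them.
CLEANUP_UPDATE = {"$unset": {"firstname": "", "lastname": ""}}

saved = prepare_for_save({"firstname": "Ada", "lastname": "Lovelace"})
```

Because step 1 keeps the old fields in place, the old and new application versions can run side by side until step 4 has been deployed everywhere.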

Changing the schema on a live database is never an easy process, no matter what database you use. It always requires some forward thinking and careful planning.

Is indexing a pain?

Indexing is not a pain, but premature optimization is. You should always test and check that you need indexes before adding them, and once you have them, check that they are actually being used.
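One way to check that an index is being used is to look at the query's explain output for an `IXSCAN` stage rather than a `COLLSCAN`. The sketch below walks a plan tree of the kind found under `queryPlanner.winningPlan`; the `uses_index` helper and the sample plans are illustrative assumptions, and the exact layout of explain output differs between server versions.

```python
from typing import Any, Dict


def uses_index(plan: Dict[str, Any]) -> bool:
    """Return True if any stage in the plan tree is an index scan."""
    if plan.get("stage") == "IXSCAN":
        return True
    # A stage has either a single child (inputStage) or several (inputStages).
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    return any(uses_index(child) for child in children)


# Hand-written sample plans, shaped like a winningPlan subtree.
indexed_plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "fullname_1"},
}
scan_plan = {"stage": "COLLSCAN"}
```

With a driver, the plan would come from the query's explain result (for example PyMongo's `Cursor.explain()`) instead of being written by hand.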

If you're worried about performance issues on a live system while creating indexes, you should consider having replica sets and doing rolling maintenance (in short: taking secondaries out of replication, creating the indexes on them, bringing them back into replication, and then repeating the process for the remaining replica set members).

Edit

What you are describing is basically the process of migrating the schema to a new one while temporarily supporting both versions of the documents.

In step 1, you're basically adding support for multiple versions of the documents. You're updating existing documents, i.e. creating the new fields, while you're still reading data from the previous version's fields. Step 2 is optional, because you can gradually update your documents as they are being saved.

In step 4 you're removing support for the previous versions from the application code and migrating to the new version. Finally, in step 5 you're removing the previous version's fields from the actual MongoDB documents.

