================
Django Evolution
================

When you run ``./manage.py syncdb``, Django will look for any new models that
have been defined, and add a database table to represent those new models.
However, if you make a change to an existing model, ``./manage.py syncdb`` will
not make any changes to the database.

This is where **Django Evolution** fits in. Django Evolution is an extension
to Django that allows you to track changes in your models over time, and to
update the database to reflect those changes.

.. admonition:: Work in Progress

    Django Evolution is a work in progress. The interface and usage described
    in this document is subject to change as we finesse the details. The SQL
    that is automatically generated may occasionally be wrong. There are some
    significant limitations to the capabilities of the system. In short,
    Django Evolution is not yet ready for a production rollout.

An example
==========

To demonstrate how Django Evolution works, let's work through the lifecycle of
a Django application - a new super-cool blogging application called
``Blogette``. After some initial design work, we come up with the following
models::

    class Author(models.Model):
        name = models.CharField(max_length=50)
        email = models.EmailField()
        date_of_birth = models.DateField()

        def __unicode__(self):
            return self.name

    class Entry(models.Model):
        headline = models.CharField(max_length=255)
        body_text = models.TextField()
        pub_date = models.DateTimeField()
        author = models.ForeignKey(Author)

        def __unicode__(self):
            return self.headline

We create a test project, create our ``blogette`` application directory, put
the model definitions in the models.py file, add ``blogette`` to
``settings.INSTALLED_APPS``, and run ``./manage.py syncdb``. This installs
the application, creating the Entry and Author tables.

The first changes
=================

But why do we want to track the date of birth of the Author? In retrospect,
this field was a mistake, so let's delete that field from the model.
However, if we just delete the field and run ``syncdb`` again, Django
won't do anything to respond to the change. We need to use a slightly
different process.

Before we start making changes to ``models.py``, we need to set up the
project to allow evolutions. To do this, we add ``django_evolution`` to the
``settings.INSTALLED_APPS`` and run ``./manage.py syncdb``. This sets up
the tables for tracking model changes::

    $ ./manage.py syncdb
    Creating table django_evolution
    Installing baseline version for testproject.blogette
    Loading 'initial_data' fixtures...
    No fixtures found.

Now we can go into ``models.py`` remove the ``date_of_birth`` field.
After removing the field, ``./manage.py syncdb`` will provide a warning
that changes have been detected::

    $ ./manage.py syncdb
    Project signature has changed - an evolution is required
    Loading 'initial_data' fixtures...
    No fixtures found.

The evolution process itself is controlled using a new management command --
**evolve**. This command is made available when you install the
``django_evolution`` application in your project.

If you want to know what has changed, you can use ask Django Evolution
to provide a hint::

    $ ./manage.py evolve --hint
    #----- Evolution for blogette
    from django_evolution.mutations import *
    from django.db import models

    MUTATIONS = [
         DeleteField('Author', 'date_of_birth')
    ]
    #----------------------
    Trial evolution successful.
    Run './manage.py evolve --execute' to apply evolution.

The output of the hint is a Django Evolution. An Evolution is python code
that defines a list of mutations that need to be made to update a model -
in this case, the fact that we need to delete the ``date_of_birth`` field
from the model ``Author``.

If you want to see what the SQL would look like for this evolution, you can
use the ``--sql`` option::

    $ ./manage.py evolve --hint --sql
    ;; Compiled evolution SQL for blogette
    ALTER TABLE "blogette_person" DROP COLUMN "date_of_birth" CASCADE;

.. note::

        The SQL output shown here is what would be produced by the Postgres 
        backend; if you are using a different backend, the SQL produced by 
        this step will be slightly different.
        
If we wanted to, we could pipe this SQL output directly into our database.
however, we don't need to - we can get Django Evolution to do this for us,
by passing the execute option to ``evolve``. When you do this, you will
be given a stern warning that you may destroy data::

    $ ./manage.py evolve --hint --execute

    You have requested a database evolution. This will alter tables
    and data currently in the 'blogette' database, and may result in
    IRREVERSABLE DATA LOSS. Evolutions should be *thoroughly* reviewed
    prior to execution.

    Are you sure you want to execute the evolutions?

    Type 'yes' to continue, or 'no' to cancel:

Assuming you answer ``yes``, Django Evolution should respond::

    Evolution successful.

Now if we re-run syncdb, we find that there are no evolution warnings::

    $ ./manage.py syncdb
    Loading 'initial_data' fixtures...
    No fixtures found.

If we were to inspect the database itself, we will find that the
``date_of_birth`` field has been deleted.

Stored evolutions
=================

At this point, we are happy with the model definitions. Once we have
developed the views and templates for Blogette, we can deploy Version 1
of Blogette on our production server. We copy our source code to the
production server, run syncdb, and the production server will have tables
that represent the latest version of our models.

Now we can start work on Blogette Version 2. For this release, we decide
to add a 'location' field to Authors, and a 'summary' field to entries.
This means we now will need to make changes to models.py. However, we
now have a production server with the Version 1 models deployed. This
means that we need to keep track of any changes we make during the
development process so that we can duplicate those changes on our
production server.

To do this, we make use of stored evolutions - a mechanism for defining
and reusing collections of evolutions for later use.

Mutation definitions
--------------------

Let's start with adding the ``location`` field to ``Authors``. First, we edit
``models.py`` to add a new CharField definition, with a length of 100.
We will also allow null values, since we have existing data that won't have
an author location. Add the following field definition to authors::

    location = models.CharField(max_length=100, null=True)

After making this change to the Author model, we know an evolution will be
required; if we run ``evolve --hint`` we will get the exact change required::

    $ ./manage.py evolve --hint
    #----- Evolution for blogette
    from django_evolution.mutations import *
    from django.db import models

    MUTATIONS = [
         AddField('Author', 'location', models.CharField, max_length=100, null=True)
    ]
    #----------------------
    Trial evolution successful.
    Run './manage.py evolve --execute' to apply evolution.

At this point, we *could* just run ``evolve --hint --execute`` to update the
development server. However, we want to remember this change, so we need to store
it in a way that Django Evolution can use it.

To do this, we need to create an evolution store. In the blogette directory,
we create an ``evolutions`` module. In this module, we create two files -
``__init__.py``, and ``add_location.py``. The ``blogette`` application
directory should now look something like this::

    /blogette
        /evolutions
            __init__.py
            add_location.py
        models.py
        views.py

``add_location.py`` is a file used to describe the evolution we want to
perform. The contents of this file is exactly the same as the
``evolve --hint`` command produced -- just copy the content between the
lines marked ``----`` into ``add_location.py``.

We then need to define an evolution sequence. This sequence defines the order
in which evolutions need to be applied. Since we only have one evolution,
the definition looks like this::

    SEQUENCE = ['add_location']

Put this statement in ``__init__.py``, and you've have defined your first
stored evolution.

Now we can apply the evolution. We don't need a hint this time - we have
a stored evolution, so we can just run ``evolve``::

    $ ./manage.py evolve
    #----- Evolution for blogette
    from django_evolution.mutations import *
    from django.db import models

    MUTATIONS = [
         AddField('Author', 'location', models.CharField, max_length=100, null=True)
    ]
    #----------------------
    Trial evolution successful.
    Run './manage.py evolve --execute' to apply evolution.

This shows us the sequence of mutations described by our stored evolution
sequence. It also tells us that the trial evolution was successful - every
time you run ``evolve``, Django Evolution will simulate the changes to make
sure that the mutations that are defined will reconcile the differences
between the models.py file and the state of the database.

Since the simulation was successful, we can apply the evolution using the
``--execute`` flag::

    $ ./manage.py evolve --execute
    ...
    Evolution successful.

.. note::

    The warning and prompt for confirmation will be displayed every time
    you evolve your database. It is omitted for clarity in this and later
    examples. If you don't want to be prompted every time, use the
    ``--noinput`` option.

SQL mutations
-------------

Now we need to add the ``summary`` field to ``Entry``. We could follow the
same procedure - however, we going to do something a little different.
Rather than define a Python Evolution file, we're going to define our
mutation in raw SQL.

The process of adding a stored SQL evolution is very similar to adding a
stored Python evolution. In this case, we're going to call our mutation
``add_summary``, so we create a file called ``add_summary.sql`` to the
evolutions directory. Into this file, we put the SQL that will make the
change we require. In our case, this means::

    ALTER TABLE blogette_entry ADD COLUMN summary varchar(100) NULL;

Then, we add the new evolution to the evolution sequence in
``evolutions/__init__.py``. The sequence should now look like this::

    SEQUENCE = ['add_location', 'add_summary']

We have now defined our SQL mutation, and how it should be executed, so we
can trial the evolution::

    $ ./manage.py evolve
    #----- Evolution for blogette
    from django_evolution.mutations import *
    from django.db import models

    MUTATIONS = [
        SQLMutation('add_summary')
    ]
    #----------------------
    Evolution could not be simulated, possibly due to raw SQL mutations.

Unfortunately, Django Evolution can't simulate SQL mutations, so we can't be
sure that the mutation is correct. However, we can inspect the SQL that will
be used using the --sql option::

    $ ./manage.py evolve --sql
    ;; Compiled evolution SQL for blogette
    ALTER TABLE blogette_entry ADD COLUMN summary varchar(100) NULL;

If we are satisfied that this SQL is correct, we can execute it::

    $ ./manage.py evolve --execute
    ...
    Evolution successful.

Meanwhile, on the production site...
------------------------------------

Now that we have finished Blogette Version 2, we can update our existing
production server. We copy the Version 2 source code, including the evolutions,
to the production server. We then run ./manage.py syncdb, which reports that
evolutions are required::

    $ ./manage.py syncdb
    Project signature has changed - an evolution is required
    There are unapplied evolutions for blogette.
    Loading 'initial_data' fixtures...
    No fixtures found.

If we run evolve, we can see the full sequence of mutations that will be
applied::

    $ ./manage.py evolve
    #----- Evolution for blogette
    from django_evolution.mutations import *
    from django.db import models

    MUTATIONS = [
        AddField('Author', 'location', models.CharField, max_length=100, null=True),
        SQLMutation('add_summary')
    ]
    #----------------------
    Evolution could not be simulated, possibly due to raw SQL mutations.

Again, since there is a raw SQL migration involved, we will need to validate
the migration ourselves using the ``--sql`` option::

    $ ./manage.py evolve --sql
    ;; Compiled evolution SQL for testproject.blogette
    ALTER TABLE blogette_author ADD COLUMN location varchar(100) NULL;
    ALTER TABLE blogette_entry ADD COLUMN summary varchar(100) NULL;

If we are happy with this sequence, we can apply it::

    $ ./manage.py evolve --execute
    ...
    Evolution successful.

Our production site now has a database that mirrors the changes made on
the development server.

Reference
=========

The Contract
------------

Django Evolution imposes one important restriction on your development process:
If you intend to make a change to your database, you **must** do it through
Django Evolution. If you modify the database outside of Django and the Django
Evolution framework, you're on your own.

The operation of Django Evolution is entirely based upon the observing and
storing changes to the models.py file. As a result, any changes to the database
will not be observed by Django Evolution, and won't be used in evaluating hints
or establishing if evolution simulations have been successful.

.. admonition:: Room for improvement

    This is one area of Django Evolution that could be significantly improved.
    Adding database introspection to allow for external database modifications
    is on the plan for the future, but for the moment, stick to the contract.

Usage of the ``evolve`` command
-------------------------------

``./manage.py evolve`` does not require any arguments or options. When run 
without any options, ``./manage.py evolve`` will list the evolutions that
have been stored and are available, but have not yet been applied to the
database.

You may optionally provide the name of one or more applications as an argument
to evolve. This will restrict the output of the command to those applications
that are named.

You cannot specify application names if you use the ``--execute``
option. Evolutions cannot be executed on a per-application basis. They must
be applied across the entire project, or not at all.

The following options may also be used to alter the behavior of the ``evolve``
command.

--hint
~~~~~~

Provide a hinted list of mutations for migrating. If no application labels are
provided, hints for the entire project will be generated. If one or more
application names are provided, the hints provided will be restricted to those
applications.

May be combined with ``--sql`` to generate hinted mutations in SQL format.

May be combined with ``--execute`` to apply the changes to the database.

--sql
~~~~~

Convert an evolution from Python syntax to SQL syntax.

If ``--hint`` is specified, the hinted list of mutations will be converted. If
``--hint`` is not specified, the generated SQL will be for any stored
evolutions that have not been applied.

--execute (-x)
~~~~~~~~~~~~~~

Apply evolutions to the database.

If ``--hint`` is specified, the hinted list of mutations will be applied. If
``--hint`` is not specified, this command will apply any unapplied stored
evolutions.

.. note::

    You cannot specify an application name if you are trying to execute an
    evolution. Evolutions must be applied across the entire database.

--purge
~~~~~~~

Remove any stale applications from the database. 

If you remove an application from the ``INSTALLED_APPS`` list, the database
tables for that application also need to be removed. Django Evolution will 
only remove these tables if you specify ``--purge`` as a command line 
argument.

--noinput
~~~~~~~~~

Use the ``--noinput`` option to suppress all user prompting, such as
"Are you sure?" confirmation messages. This is useful if ``django-admin.py``
is being executed as an unattended, automated script.

--verbosity
~~~~~~~~~~~

Use ``--verbosity`` to specify the amount of notification and debug 
information that the evolve command should print to the console.

    * ``0`` means no input.
    * ``1`` means normal input (default).
    * ``2`` means verbose input.

Built-in Mutations
------------------

Django Evolution comes with a number of mutations built-in. These mutations
cover the simple cases for model evolutions.

AddField(model_name, field_name, initial=None, \*\*kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add the field 'field_name' to the model, using the provided initial value to
populate the new column. The extra keyword arguments are the
arguments used to construct the new field.

If you are adding a column to a existing table, you will need to specify
either ``null=True`` on the field definition, or provide an initial value for
the new field. This is because after adding the field, every item in the 
database must be able to provide a value for the new field. Since existing 
rows will not have a value for the new field, you must either explicitly allow 
a value of NULL in the model definition, or you must provide an initial value.

The initial value can be a literal (e.g., ``initial=3`` for an `IntegerField`), 
or a callable that takes no arguments and returns a literal. If you use a 
literal, Django Evolution will apply any quoting required to protect the 
literal value. If you use a callable, the value returned by the callable will 
be inserted directly into the SQL ``INSERT`` statement that populates the new 
column, so it should be in a format appropriate for database insertion. 

If your newly added field definition provides a default, the default will be
used as the initial value.

Example::

    # Add a nickname field, allowing NULL values.
    AddField('Author', 'nickname', models.CharField, max_length=100, null=True)

    # Add a nickname field, specifying the nickname for all existing Authors
    # to be 'That Guy'
    AddField('Author', 'nickname', models.CharField, max_length=100, initial='That Guy')

    # Add a note field, using a callable for the initial value.
    # Note that the value returned by the callable has been wrapped in
    # quotes, so it is ready for database insertion.
    def initial_note_value():
        return "'Registered %s'" % datetime.now()
    AddField('Author', 'note', models.CharField, max_length=100, initial=initial_note_value)

ChangeField(model_name, field_name, initial=None, \*\*kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Change database affecting attributes of the field 'field_name' to the model. Changes to the field
are specified using keyword arguments such as ''null=True'' and ''max_length=20''.

Like the AddField, if the change to the field is to set ''null=False'', an initial value must be provided
as either a literal or a callable. 

The ChangeField will accept both changed attributes and unchanged attributes of a field. This means
that if a CharField has a modified ''max_length'', its ''null'' attribute may be re-specified in the ChangeField.
The ChangeField will detect that the ''null'' attribute was specified but not changed and will silently ignore
the re-specification. 

Example::
    # Adding a null constraint to the birthday field.
    ChangeField('Author', 'date_of_birth', null=True)
    
    # Increasing the size of the author's name.
    ChangeField('Author', 'name', max_length=100)
    
.. note::
    The ChangeField only supports the null, max_length, unique, db_column, db_index and db_table (for many to many fields) attribute at this time.


DeleteField(model_name, field_name)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Remove the field 'field_name' from the model.

Example::

    # Remove the author's name from the Author model
    DeleteField('Author', 'name')

RenameField(model_name, old_field_name, new_field_name, db_column=None, db_table=None)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rename the field 'old_name' to 'new_name'. 

``db_column`` and ``db_table`` are optional arguments; if provided, they
specify the new column name (in the case of a value or ForeignKey field) or
the new table name (in the case of a ManyToManyField). If you attempt to 
specify a table name for a normal or ForeignKey rename, an Exception will 
be raised; similarly, and exception will be raised if you attempt to specify 
a column name for a ManyToManyField rename. 

Example::

    # Rename the field 'name' as 'full_name'
    RenameField('Author', 'name', 'full_name')

DeleteModel(model_name)
~~~~~~~~~~~~~~~~~~~~~~~

Remove the model 'model_name' from the application.

Example::

    # Remove the Author model
    DeleteModel('Author')

SQLMutation(tag, sql, update_func=None)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run a list of SQL statements as a mutation. This can be used as an alternative
to defining an mutation as an SQL file. By default, SQLMutations cannot be
simulated - therefore, you should carefully audit the SQL for any evolution
sequence including an SQLMutation before you execute that SQL on a database.

``tag`` is a text description of the changes. It will usually be the
evolution name.

``update_func`` is a optional function that will be invoked to update the
project signature when the mutation is simulated. This allows SQLMutations
to be included as part of a simulated evolution sequence.

An update function takes the application label and current project signature,
and mutates that signature until it reflects the changes implemented by the
SQL. For example, a simple mutation function to remove an 'Author' table
from an application might look something like::

    def my_update_func(app_label, project_signature):
        del project_signature[app_label]['Author']

If you provide an update function, it will be invoked as part of the
simulation sequence. Any discrepancies between the simulated and expected
project signatures will be reported as an error during the evolution process.

Example::

    # Add a new column to the Author table
    SQLMutation('add_location', ['ALTER TABLE blogette_author ADD COLUMN location varchar(100);'])

Defining your own mutations
---------------------------

If you have a special set of mutation requirements, you can define your own
mutation. All you need to do is subclass
``django_evolution.mutations.BaseMutation``, and include the mutation in a
stored evolution.

More to come
============

Django Evolution isn't complete - yet. There is still lots to do, and we'd
appreciate any help or suggestions you may have. The following are some examples
of features that we would like to add:

   1. Support for more mutation types, including:
       a. ChangeModel - to handle model constraints like ``unique_together``,
       b. ChangeData - to handle data changes without a corresponding field
          definition change.

   2. Improving support for the MySQL database backend

   3. Support for other database backends (Oracle...)

   4. Support for database introspection - the ability to look at the actual
      database structure to determine if evolutions are required.