There are 3 layers of data munging happening in the import code.

importLib.py has a representation of each high-level object (ChannelFamily,
Channel, Errata, Package etc). The representation defines the properties and
the types for each of those properties. This should be really _the_ data model
for us.

backendOracle.py has an Oracle representation of that data model, i.e. a
mapping to the table structures we chose to use. Some conversions may be
necessary to go from one to the other. It also defines the primary keys for
each of the tables, necessary to perform the lookups properly.

Also, each table has a severityHash dictionary, describing how "fatal" a
difference in one of those fields is. I will explain this later.

The third layer (but it is really the first if you think about how stuff gets
called) is:

<foo><object>Source.py

bugzillaErrataSource.py, headerSource.py etc.

This layer is responsible for reading the data from the source to the
importLib data object.

The business logic is supposed to happen in the *Import.py files, while all
the database access should be in backend.py (but I don't think it's very
consistent).


In general, a data object (like a Package) will span across multiple tables.
In the case of a package, it spans over rhnPackage,
rhnPackage{Provides,Requires,Conflicts,Obsoletes,Files}, rhnChannelPackage
(when channel subscription is requested) etc. Therefore, we needed a way to
generically persist an object in the database using the nested table
structures we have in the relational DB.

One problem showing up is, how do you reconcile differences?
For instance, most of our tables have a "created" and "modified" field (and
some of them have a "last_modified" field, to track the change of the object
itself, instead of the row in the table). While it is important to control the
data in the last_modified field, it doesn't necessarily participate in the
comparison between an incoming object and its current representation in the
database (if one exists).

So, in rough lines, the code will:
    - read the data from the data source and convert it into the proper object
    - using the object's primary key attributes, look it up in the database
    - if the object doesn't exist, it initializes all the IDs for the primary
      keys from sequences, and proceeds to insert it into the DB
    - if the object does exist, it will check to see how different it is from
      the incoming one.

Each "import" process has an uploadForce and an ignoreUploaded flag associated
with it. These are all documented in __processObjectCollection__ (and some of
the documentation exists there already).

The uploadForce is the amount of karma you have in order to go over
differences on fields with various severity levels. Looking at severityHash in
backendOracle.py, you'll notice several values for severities:
    0 means "if things are different in this field, just ignore them"
    1 means "almost the same, difference can be ignored" (only used for 
        packages and only for the path). Two packages that are identical
        except for the path are identical, but sometimes you may still want to
        fix the path.
    2 means "same object, different signature" (only for packages too). If
      build_time and build_host are the same (as well as everything else except
      for checksum and package_size), then the package was resigned.
    3 means "everything the same, except build_time and build_host". Probably
      package recompiled.
    4 (default) means something critical changed.

You'll notice that last_modified is 0.5, I think I needed it that way for a
reason I don't quite remember - probably in the satellite sync code.

As the comments in __processObjectCollection suggest, the diff object
generated as a result of a comparison will draw its severity as the max of the
severities in each field.

The upload force has to be greater than or equal to the severity diff.

One other thing that proved to be a major performance factor was multi-row
inserts. You will notice processObjectCollection returns a DML object, which
has the changes to perform on each table. The changes will be performed one
table at the time, using executemany (mapped into Oracle's OCI interface to
perform bulk inserts/deletes/updates).
