Container

Container Auditor

class swift.container.auditor.ContainerAuditor(conf, logger=None)

Bases: swift.common.daemon.Daemon

Audit containers.

container_audit(path)

Audits the given container path

Parameters: path – the path to a container db
run_forever(*args, **kwargs)

Run the container audit until stopped.

run_once(*args, **kwargs)

Run the container audit once.

swift.container.auditor.random() → x in the interval [0, 1).

Container Backend

Pluggable Back-ends for Container Server

class swift.container.backend.ContainerBroker(db_file, timeout=25, logger=None, account=None, container=None, pending_timeout=None, stale_reads_ok=False)

Bases: swift.common.db.DatabaseBroker

Encapsulates working with a container database.

create_container_info_table(conn, put_timestamp, storage_policy_index)

Create the container_info table which is specific to the container DB. Not a part of Pluggable Back-ends, internal to the baseline code. Also creates the container_stat view.

Parameters:
  • conn – DB connection object
  • put_timestamp – put timestamp
  • storage_policy_index – storage policy index
create_object_table(conn)

Create the object table which is specific to the container DB. Not a part of Pluggable Back-ends, internal to the baseline code.

Parameters: conn – DB connection object
create_policy_stat_table(conn, storage_policy_index=0)

Create policy_stat table.

Parameters:
  • conn – DB connection object
  • storage_policy_index – the policy_index the container is being created with
db_contains_type = 'object'
db_reclaim_timestamp = 'created_at'
db_type = 'container'
delete_object(name, timestamp, storage_policy_index=0)

Mark an object deleted.

Parameters:
  • name – object name to be deleted
  • timestamp – timestamp when the object was marked as deleted
  • storage_policy_index – the storage policy index for the object
empty()

Check if container DB is empty.

Returns: True if the database has no active objects, False otherwise
get_db_version(conn)
get_info()

Get global data for the container.

Returns: dict with keys: account, container, created_at, put_timestamp, delete_timestamp, status_changed_at, object_count, bytes_used, reported_put_timestamp, reported_delete_timestamp, reported_object_count, reported_bytes_used, hash, id, x_container_sync_point1, x_container_sync_point2, and storage_policy_index.
get_info_is_deleted()

Get the is_deleted status and info for the container.

Returns: a tuple in the form (info, is_deleted), where info is a dict as returned by get_info and is_deleted is a boolean.
get_misplaced_since(start, count)

Get a list of objects which are in a storage policy different from the container’s storage policy.

Parameters:
  • start – last reconciler sync point
  • count – maximum number of entries to get
Returns:

list of dicts with keys: name, created_at, size, content_type, etag, storage_policy_index
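The selection described above can be sketched with an in-memory SQLite table. This is only an illustration under stated assumptions: the table layout, the standalone function signature (taking a connection and the container's policy index as arguments), and the sample rows are hypothetical and are not taken from the broker's actual schema or code.

```python
import sqlite3

KEYS = ("name", "created_at", "size", "content_type", "etag",
        "storage_policy_index")


def get_misplaced_since(conn, container_policy_index, start, count):
    """Return up to `count` rows after ROWID `start` in a foreign policy."""
    cursor = conn.execute(
        "SELECT name, created_at, size, content_type, etag, "
        "storage_policy_index FROM object "
        "WHERE ROWID > ? AND storage_policy_index != ? "
        "ORDER BY ROWID ASC LIMIT ?",
        (start, container_policy_index, count),
    )
    return [dict(zip(KEYS, row)) for row in cursor]


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object (name TEXT, created_at TEXT, size INTEGER, "
    "content_type TEXT, etag TEXT, storage_policy_index INTEGER)"
)
conn.executemany(
    "INSERT INTO object VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("a", "1.0", 1, "text/plain", "e1", 0),
        ("b", "2.0", 2, "text/plain", "e2", 1),
        ("c", "3.0", 3, "text/plain", "e3", 0),
        ("d", "4.0", 4, "text/plain", "e4", 2),
    ],
)

# The container is assumed to use policy 0, so "b" and "d" are misplaced.
misplaced = get_misplaced_since(conn, 0, -1, 10)
```

Passing the ROWID of the last processed row as start on the next call resumes the scan where the previous one stopped.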

get_policy_stats()
get_reconciler_sync()
has_multiple_policies()
list_objects_iter(limit, marker, end_marker, prefix, delimiter, path=None, storage_policy_index=0, reverse=False)

Get a list of objects sorted by name starting at marker onward, up to limit entries. Entries will begin with the prefix and will not have the delimiter after the prefix.

Parameters:
  • limit – maximum number of entries to get
  • marker – marker query
  • end_marker – end marker query
  • prefix – prefix query
  • delimiter – delimiter for query
  • path – if defined, will set the prefix and delimiter based on the path
  • storage_policy_index – storage policy index for query
  • reverse – reverse the result order.
Returns:

list of tuples of (name, created_at, size, content_type, etag)
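As a rough illustration of how limit, marker, end_marker, prefix and delimiter interact, the sketch below filters a plain list of names. It follows the description above literally and is not the broker's implementation: the function name list_names is hypothetical, the exclusive treatment of marker and end_marker is an assumption, and the path, storage_policy_index and reverse parameters are left out.

```python
def list_names(names, limit, marker="", end_marker="", prefix="",
               delimiter=""):
    """Filter object names the way the listing parameters describe."""
    selected = []
    for name in sorted(names):
        if len(selected) >= limit:
            break
        if marker and name <= marker:
            continue  # assumed: the marker itself is not returned
        if end_marker and name >= end_marker:
            break  # assumed: listing stops before the end marker
        if not name.startswith(prefix):
            continue
        if delimiter and delimiter in name[len(prefix):]:
            continue  # skip names with the delimiter after the prefix
        selected.append(name)
    return selected


names = [
    "notes.txt",
    "photos/2020/c.jpg",
    "photos/a.jpg",
    "photos/b.jpg",
    "photos/z.jpg",
]

# First page of two entries directly under "photos/".
first_page = list_names(names, 2, prefix="photos/", delimiter="/")
# Next page: pass the last name of the previous page as the marker.
next_page = list_names(names, 2, marker=first_page[-1],
                       prefix="photos/", delimiter="/")
```

Paging through a large container then amounts to repeating the call with the last returned name as the new marker until fewer than limit entries come back.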

make_tuple_for_pickle(record)
merge_items(item_list, source=None)

Merge items into the object table.

Parameters:
  • item_list – list of dictionaries of {'name', 'created_at', 'size', 'content_type', 'etag', 'deleted', 'storage_policy_index', 'ctype_timestamp', 'meta_timestamp'}
  • source – if defined, update incoming_sync with the source
put_object(name, timestamp, size, content_type, etag, deleted=0, storage_policy_index=0, ctype_timestamp=None, meta_timestamp=None)

Creates an object in the DB with its metadata.

Parameters:
  • name – object name to be created
  • timestamp – timestamp of when the object was created
  • size – object size
  • content_type – object content-type
  • etag – object etag
  • deleted – if True, marks the object as deleted and sets the deleted_at timestamp to timestamp
  • storage_policy_index – the storage policy index for the object
  • ctype_timestamp – timestamp of when content_type was last updated
  • meta_timestamp – timestamp of when metadata was last updated
reported(put_timestamp, delete_timestamp, object_count, bytes_used)

Update reported stats, available with container’s get_info.

Parameters:
  • put_timestamp – put_timestamp to update
  • delete_timestamp – delete_timestamp to update
  • object_count – object_count to update
  • bytes_used – bytes_used to update
set_storage_policy_index(policy_index, timestamp=None)

Update the container_stat policy_index and status_changed_at.

set_x_container_sync_points(sync_point1, sync_point2)
storage_policy_index
update_reconciler_sync(point)
swift.container.backend.update_new_item_from_existing(new_item, existing)

Compare the data and meta related timestamps of a new object item with the timestamps of an existing object record, and update the new item with data and/or meta related attributes from the existing record if their timestamps are newer.

The multiple timestamps are encoded into a single string for storing in the ‘created_at’ column of the objects db table.

Parameters:
  • new_item – A dict of object update attributes
  • existing – A dict of existing object attributes
Returns:

True if any attributes of the new item dict were found to be newer than the existing record and were therefore not updated; otherwise False, implying that the updated item is equal to the existing record.
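The comparison can be illustrated with a simplified sketch. It is not the function's actual code: the real function works on timestamps encoded into the 'created_at' string, whereas this sketch assumes three separate numeric fields (data_ts, ctype_ts, meta_ts), and both those field names and the function name merge_from_existing are hypothetical.

```python
def merge_from_existing(new_item, existing):
    """Copy newer-or-equal attributes from existing into new_item.

    Returns True if new_item kept at least one attribute group of its own.
    """
    groups = (
        ("data_ts", ("size", "etag")),      # object data attributes
        ("ctype_ts", ("content_type",)),    # content-type attributes
        ("meta_ts", ()),                    # metadata timestamp only
    )
    newer = False
    for ts_key, attrs in groups:
        if existing[ts_key] >= new_item[ts_key]:
            # The existing record wins: carry its values forward.
            new_item[ts_key] = existing[ts_key]
            for attr in attrs:
                new_item[attr] = existing[attr]
        else:
            newer = True
    return newer


existing = {"data_ts": 10, "size": 100, "etag": "old",
            "ctype_ts": 10, "content_type": "text/plain", "meta_ts": 10}
update = {"data_ts": 5, "size": 1, "etag": "stale",
          "ctype_ts": 20, "content_type": "image/png", "meta_ts": 10}

changed = merge_from_existing(update, existing)
# The stale data attributes are replaced; the newer content type is kept.
unchanged = merge_from_existing(dict(existing), existing)
```

Here changed is True because the update carries a newer content type, while unchanged is False because nothing in the second item is newer than the existing record.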

Container Server

class swift.container.server.ContainerController(conf, logger=None)

Bases: swift.common.base_storage_server.BaseStorageServer

WSGI Controller for the container server.

DELETE(ctrl, *args, **kwargs)

Handle HTTP DELETE request.

GET(ctrl, *args, **kwargs)

Handle HTTP GET request.

HEAD(ctrl, *args, **kwargs)

Handle HTTP HEAD request.

POST(ctrl, *args, **kwargs)

Handle HTTP POST request.

PUT(ctrl, *args, **kwargs)

Handle HTTP PUT request.

REPLICATE(ctrl, *args, **kwargs)

Handle HTTP REPLICATE request (JSON-encoded RPC calls for replication).

account_update(req, account, container, broker)

Update the account server(s) with latest container info.

Parameters:
  • req – swob.Request object
  • account – account name
  • container – container name
  • broker – container DB broker object
Returns:

an HTTPNotFound response object if all the account requests return a 404 error code; an HTTPBadRequest response object if the account cannot be updated due to a malformed header; otherwise None.

allowed_sync_hosts = None

The list of hosts we’re allowed to send syncs to. This can be overridden by data in self.realms_conf.

create_listing(req, out_content_type, info, resp_headers, metadata, container_list, container)
get_and_validate_policy_index(req)

Validate that the index supplied maps to a policy.

Returns: policy index from request, or None if not present
Raises: HTTPBadRequest – if the supplied index is bogus
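A minimal sketch of this kind of validation is shown below. It makes several assumptions that are not stated in this reference: the index is read from a header named X-Backend-Storage-Policy-Index, the set of known policies is passed in explicitly, and a plain ValueError stands in for the HTTPBadRequest response.

```python
POLICY_HEADER = "X-Backend-Storage-Policy-Index"  # assumed header name


def get_policy_index(headers, known_policies):
    """Return the requested policy index, or None if none was supplied."""
    raw = headers.get(POLICY_HEADER)
    if raw is None:
        return None
    try:
        index = int(raw)
    except ValueError:
        raise ValueError("policy index %r is not an integer" % raw)
    if index not in known_policies:
        raise ValueError("policy index %d maps to no policy" % index)
    return index


known = {0, 1}  # hypothetical set of configured policy indexes
absent = get_policy_index({}, known)
valid = get_policy_index({POLICY_HEADER: "1"}, known)
```

Returning None for a missing header keeps "not specified" distinct from policy 0, which matters when the caller has to fall back to a default.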
realms_conf = None

ContainerSyncCluster instance for validating sync-to values.

save_headers = ['x-container-read', 'x-container-write', 'x-container-sync-key', 'x-container-sync-to']
server_type = 'container-server'
update_data_record(record)

Perform any mutations to container listing records that are common to all serialization formats, and return the result as a dict.

Converts the created time to an ISO timestamp. Replaces size with the value of the 'swift_bytes' content type parameter.

Parameters: record – object entry record
Returns: modified record
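The two conversions can be sketched as follows. This is an illustration only: the function name to_listing_record, the output keys (bytes, hash, last_modified), the timestamp format string, and the assumption that the record is a (name, created_at, size, content_type, etag) tuple are choices made for the example, not details confirmed by this reference.

```python
from datetime import datetime, timezone


def to_listing_record(record):
    """Turn a listing tuple into a dict ready for serialization."""
    name, created_at, size, content_type, etag = record

    # Pull a "swift_bytes" parameter out of the content type, if present,
    # and use its value in place of the stored size.
    parts = [part.strip() for part in content_type.split(";")]
    kept = [parts[0]]
    for part in parts[1:]:
        key, _, value = part.partition("=")
        if key.strip() == "swift_bytes":
            size = int(value)
        else:
            kept.append(part)

    created = datetime.fromtimestamp(float(created_at), tz=timezone.utc)
    return {
        "name": name,
        "bytes": size,
        "hash": etag,
        "content_type": ";".join(kept),
        "last_modified": created.strftime("%Y-%m-%dT%H:%M:%S.%f"),
    }


listing = to_listing_record(
    ("big", "1500000000.00000", 0,
     "application/octet-stream;swift_bytes=2048", "abc123")
)
# listing["bytes"] is 2048 and
# listing["last_modified"] is "2017-07-14T02:40:00.000000"
```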
swift.container.server.app_factory(global_conf, **local_conf)

paste.deploy app factory for creating WSGI container server apps

swift.container.server.gen_resp_headers(info, is_deleted=False)

Convert container info dict to headers.

Container Replicator

class swift.container.replicator.ContainerReplicator(conf, logger=None)

Bases: swift.common.db_replicator.Replicator

brokerclass

alias of ContainerBroker

datadir = 'containers'
default_port = 6201
delete_db(broker)

Ensure that reconciler databases are only cleaned up at the end of the replication run.

dump_to_reconciler(broker, point)

Look for object rows in broker that record object updates in the wrong storage policy and have a ROWID greater than the rowid given as point.

Parameters:
  • broker – the container broker with misplaced objects
  • point – the last verified reconciler_sync_point
Returns:

the last successfully enqueued rowid

feed_reconciler(container, item_list)

Add queue entries for rows in item_list to the local reconciler container database.

Parameters:
  • container – the name of the reconciler container
  • item_list – the list of rows to enqueue
Returns:

True if successfully enqueued

find_local_handoff_for_part(part)

Look through devices in the ring for the first handoff device that was identified during job creation as available on this node.

Returns: a node entry from the ring
get_reconciler_broker(timestamp)

Get a local instance of the reconciler container broker that is appropriate to enqueue the given timestamp.

Parameters: timestamp – the timestamp of the row to be enqueued
Returns: a local reconciler broker
replicate_reconcilers()

Ensure any items merged to reconciler containers during replication are pushed out to correct nodes and any reconciler containers that do not belong on this node are removed.

report_up_to_date(full_info)
run_once(*args, **kwargs)
server_type = 'container'
class swift.container.replicator.ContainerReplicatorRpc(root, datadir, broker_class, mount_check=True, logger=None)

Bases: swift.common.db_replicator.ReplicatorRpc

Container Sync

class swift.container.sync.ContainerSync(conf, container_ring=None, logger=None)

Bases: swift.common.daemon.Daemon

Daemon to sync syncable containers.

This is done by scanning the local devices for container databases and checking for x-container-sync-to and x-container-sync-key metadata values. If they exist, newer rows since the last sync will trigger PUTs or DELETEs to the other container.

The actual syncing is slightly more complicated to make use of the three (or number-of-replicas) main nodes for a container without each trying to do the exact same work but also without missing work if one node happens to be down.

Two sync points are kept per container database. All rows between the two sync points trigger updates. Any rows newer than both sync points cause updates depending on the node’s position for the container (with three replicas, each primary node sends one third of them; the share depends on the replica count). After a sync run, the first sync point is set to the newest ROWID known and the second sync point is set to the newest ROWID for which all updates have been sent.

An example may help. Assume replica count is 3 and perfectly matching ROWIDs starting at 1.

First sync run, database has 6 rows:

  • SyncPoint1 starts as -1.
  • SyncPoint2 starts as -1.
  • No rows between points, so no “all updates” rows.
  • Six rows newer than SyncPoint1, so a third of the rows are sent by node 1, another third by node 2, remaining third by node 3.
  • SyncPoint1 is set as 6 (the newest ROWID known).
  • SyncPoint2 is left as -1 since no “all updates” rows were synced.

Next sync run, database has 12 rows:

  • SyncPoint1 starts as 6.
  • SyncPoint2 starts as -1.
  • The rows between -1 and 6 all trigger updates (most of which should short-circuit on the remote end as having already been done).
  • Six more rows newer than SyncPoint1, so a third of the rows are sent by node 1, another third by node 2, remaining third by node 3.
  • SyncPoint1 is set as 12 (the newest ROWID known).
  • SyncPoint2 is set as 6 (the newest “all updates” ROWID).

In this way, under normal circumstances each node sends its share of updates each run and just sends a batch of older updates to ensure nothing was missed.
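The two runs in the example can be reproduced with a small simulation. It is a sketch under assumptions, not the daemon's code: the function name sync_run is hypothetical, and rows newer than SyncPoint1 are assigned to nodes by ROWID modulo the replica count, which is only one possible way to give each node a third of the rows.

```python
def sync_run(rowids, sync_point1, sync_point2, node_index, replica_count=3):
    """Simulate one sync run for one node.

    Returns (all_updates, own_share, new_sync_point1, new_sync_point2).
    """
    rowids = list(rowids)
    # Rows between the two sync points are sent by every node.
    all_updates = [r for r in rowids if sync_point2 < r <= sync_point1]
    # Rows newer than SyncPoint1 are split between the nodes
    # (assumed rule: ROWID modulo replica count picks the node).
    own_share = [r for r in rowids
                 if r > sync_point1 and r % replica_count == node_index]
    new_sync_point1 = max(rowids) if rowids else sync_point1
    new_sync_point2 = all_updates[-1] if all_updates else sync_point2
    return all_updates, own_share, new_sync_point1, new_sync_point2


# First run: six rows, both sync points start at -1.
run1 = sync_run(range(1, 7), -1, -1, node_index=0)
# Second run: twelve rows, sync points carried over from the first run.
run2 = sync_run(range(1, 13), run1[2], run1[3], node_index=0)
```

For node 0, the first run sends no "all updates" rows and leaves the sync points at 6 and -1; the second run resends rows 1 to 6, sends its own share of rows 7 to 12, and leaves the sync points at 12 and 6, matching the walkthrough above.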

Parameters:
  • conf – The dict of configuration values from the [container-sync] section of the container-server.conf
  • container_ring – If None, the <swift_dir>/container.ring.gz will be loaded. This is overridden by unit tests.
allowed_sync_hosts = None

The list of hosts we’re allowed to send syncs to. This can be overridden by data in self.realms_conf.

conf = None

The dict of configuration values from the [container-sync] section of the container-server.conf.

container_deletes = None

Number of successful DELETEs triggered.

container_failures = None

Number of containers that had a failure of some type.

container_puts = None

Number of successful PUTs triggered.

container_report(start, end, sync_point1, sync_point2, info, max_row)
container_ring = None

swift.common.ring.Ring for locating containers.

container_skips = None

Number of containers whose sync has been turned off, but are not yet cleared from the sync store.

container_stats = None

Per-container stats, collected for each container:

  • puts – the number of puts that were done for the container
  • deletes – the number of deletes that were done for the container
  • bytes – the total number of bytes transferred for the container

container_sync(path)

Checks the given path for a container database, determines if syncing is turned on for that database and, if so, sends any updates to the other container.

Parameters: path – the path to a container db
container_sync_row(row, sync_to, user_key, broker, info, realm, realm_key)

Sends the update the row indicates to the sync_to container. Update can be either delete or put.

Parameters:
  • row – The updated row in the local database triggering the sync update.
  • sync_to – The URL to the remote container.
  • user_key – The X-Container-Sync-Key to use when sending requests to the other container.
  • broker – The local container database broker.
  • info – The get_info result from the local container database broker.
  • realm – The realm from self.realms_conf, if there is one. If None, fall back to using the older allowed_sync_hosts way of syncing.
  • realm_key – The realm key from self.realms_conf, if there is one. If None, fall back to using the older allowed_sync_hosts way of syncing.
Returns:

True on success

container_syncs = None

Number of containers with sync turned on that were successfully synced.

container_time = None

Maximum amount of time to spend syncing a container before moving on to the next one. If a container sync hasn’t finished in this time, it’ll just be resumed next scan.

devices = None

Path to the local device mount points.

interval = None

Minimum time between full scans. This is to keep the daemon from running wild on near empty systems.

logger = None

Logger to use for container-sync log lines.

mount_check = None

Indicates whether mount points should be verified as actual mount points (normally true, false for tests and SAIO).

realms_conf = None

ContainerSyncCluster instance for validating sync-to values.

report()

Writes a report of the stats to the logger and resets the stats for the next report.

reported = None

Time of last stats report.

run_forever(*args, **kwargs)

Runs container sync scans until stopped.

run_once(*args, **kwargs)

Runs a single container sync scan.

select_http_proxy()
sync_store = None

ContainerSyncStore instance for iterating over synced containers.

swift.container.sync.random() → x in the interval [0, 1).

Container Updater

class swift.container.updater.ContainerUpdater(conf)

Bases: swift.common.daemon.Daemon

Update container information in account listings.

container_report(node, part, container, put_timestamp, delete_timestamp, count, bytes, storage_policy_index)

Report container info to an account server.

Parameters:
  • node – node dictionary from the account ring
  • part – partition the account is on
  • container – container name
  • put_timestamp – put timestamp
  • delete_timestamp – delete timestamp
  • count – object count in the container
  • bytes – bytes used in the container
  • storage_policy_index – the policy index for the container
container_sweep(path)

Walk the path looking for container DBs and process them.

Parameters: path – path to walk
get_account_ring()

Get the account ring. Load it if it hasn’t been yet.

get_paths()

Get paths to all of the partitions on each drive to be processed.

Returns: a list of paths
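One way to picture the result is a walk over a devices directory that collects every partition directory under each drive. The sketch below is a standalone approximation, not the updater's method: it assumes a layout of <devices>/<drive>/containers/<partition>, takes the devices root as an argument, and skips any mount checking.

```python
import os
import tempfile


def get_paths(devices, datadir="containers"):
    """Collect partition directories below each drive in `devices`."""
    paths = []
    for device in sorted(os.listdir(devices)):
        container_dir = os.path.join(devices, device, datadir)
        if not os.path.isdir(container_dir):
            continue  # drive holds no container data
        for partition in sorted(os.listdir(container_dir)):
            paths.append(os.path.join(container_dir, partition))
    return paths


# Build a throwaway layout: one drive with two partitions, one empty drive.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sda", "containers", "101"))
    os.makedirs(os.path.join(root, "sda", "containers", "102"))
    os.makedirs(os.path.join(root, "sdb"))
    found = [os.path.relpath(path, root) for path in get_paths(root)]
```

Each returned path could then be handed to a sweep function that looks for container DBs inside that partition.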
process_container(dbfile)

Process a container, and update the information in the account.

Parameters: dbfile – container DB to process
run_forever(*args, **kwargs)

Run the updater continuously.

run_once(*args, **kwargs)

Run the updater once.

swift.container.updater.random() → x in the interval [0, 1).