// Copyright 2015 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// Package dscache provides a transparent cache for RawDatastore which is
// backed by Memcache.
//
// Inspiration
//
// Although this is not a port of any particular implementation, it takes
// inspiration from these fine libraries:
//   - https://cloud.google.com/appengine/docs/python/ndb/
//   - https://github.com/qedus/nds
//   - https://github.com/mjibson/goon
//
// Algorithm
//
// Memcache contains cache entries for single datastore entities. The memcache
// key looks like
//
//   "gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(datastore.Key))
//
// Where:
//   - vers is an ascii-hex-encoded number (currently 1).
//   - shard is a zero-based ascii-hex-encoded number (depends on shardsForKey).
//   - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated
//     memcache sizes of up to 170 Exabytes (assuming an average entry size of
//     100KB including the memcache key). This is clearly overkill, but MD5
//     could start showing collisions at this probability in as small as a 26GB
//     cache (and also MD5 sucks).
//
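// As a sketch only (not the package's actual implementation), the key
// construction above could be written with just the standard library. The
// cacheKey and serializedKey names are hypothetical, and the serialization of
// datastore.Key to bytes is assumed to happen elsewhere:
//
//   import (
//     "crypto/sha1"
//     "encoding/base64"
//     "fmt"
//   )
//
//   // cacheKey renders "gae:<vers>:<shard>:<Base64_std_nopad(SHA1(key))>".
//   func cacheKey(vers, shard int, serializedKey []byte) string {
//     sum := sha1.Sum(serializedKey)
//     return fmt.Sprintf("gae:%x:%x:%s", vers, shard,
//       base64.RawStdEncoding.EncodeToString(sum[:]))
//   }
//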
// The memcache value is a compression byte, indicating the scheme (See
// CompressionType), followed by the encoded (and possibly compressed) value.
// Encoding is done with datastore.PropertyMap.Write(). The memcache value
// may also be the empty byte sequence, indicating that this entity is deleted.
//
// The memcache entry may also have a 'flags' value set to one of the following:
//   - 0 "entity" (cached value)
//   - 1 "lock" (someone is mutating this entry)
//
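// As an illustration, the value layout and flags might be modeled like this
// (the names are assumptions; the real scheme byte values come from
// CompressionType, and compression itself is not shown):
//
//   const (
//     flagEntity = 0 // cached value
//     flagLock   = 1 // someone is mutating this entry
//   )
//
//   // encodeValue prefixes the serialized PropertyMap bytes with a
//   // compression-scheme byte. An empty value means "this entity is deleted".
//   func encodeValue(propMapBytes []byte, compressionScheme byte) []byte {
//     if propMapBytes == nil {
//       return []byte{} // deletion marker
//     }
//     return append([]byte{compressionScheme}, propMapBytes...)
//   }
//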
// Algorithm - Put and Delete
//
// On a Put (or Delete), an empty value is unconditionally written to
// memcache with a LockTimeSeconds expiration (default 31 seconds), and
// a memcache flag value of 0x1 (indicating that it's a put-locked key). The
// value is left empty to preclude Get operations (whose locks carry a random
// nonce) from believing that they possess the lock.
//
// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
//
// The datastore operation will then occur. Assuming success, Put will then
// unconditionally delete all of the memcache locks. At some point later, a
// Get will write its own lock, get the value from datastore, and compare and
// swap to populate the value (detailed below).
//
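// A sketch of that write path, using a hypothetical minimal memcache
// interface (mc) rather than the real memcache service API, and the flagLock
// constant from the sketch above:
//
//   import "time"
//
//   type mc interface {
//     Set(key string, value []byte, flags uint32, ttl time.Duration) error
//     DeleteMulti(keys []string) error
//   }
//
//   func cachedPut(m mc, lockKeys []string, datastorePut func() error) error {
//     // Unconditionally write empty, flagLock'd entries. Failure here is a
//     // HARD ERROR (see DANGER ZONE).
//     for _, k := range lockKeys {
//       if err := m.Set(k, []byte{}, flagLock, 31*time.Second); err != nil {
//         return err
//       }
//     }
//     if err := datastorePut(); err != nil {
//       return err
//     }
//     // Best effort: clear the locks so a later Get can repopulate the cache.
//     _ = m.DeleteMulti(lockKeys)
//     return nil
//   }
//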
// Algorithm - Get
//
// On a Get, we "Add" a lock for it (which only does something if there's no
// entry in memcache yet) with a random nonce value. We immediately Get the
// memcache entries back (for CAS purposes later).
//
// If it doesn't exist (unlikely since we just Add'd it) or if its flag is
// "lock" and the Value != the nonce we put there, go hit the datastore without
// trying to update memcache.
//
// If its flag is "entity", decode the object and return it. If the Value is
// the empty byte sequence, return ErrNoSuchEntity.
//
// If its flag is "lock" and the Value equals the nonce, go get it from the
// datastore. If that's successful, then encode the value to bytes, and CAS
// the object to memcache. The CAS will succeed if nothing else touched the
// memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).
//
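// The read path, again as a rough sketch with a hypothetical compare-and-swap
// capable memcache interface, reusing flagLock, flagEntity and encodeValue
// from the sketches above (the real code works in terms of memcache items and
// PropertyMap decoding, which are elided here):
//
//   import (
//     "bytes"
//     "errors"
//     "time"
//   )
//
//   // errNoSuchEntity stands in for the real ErrNoSuchEntity; datastoreGet
//   // is assumed to return it when the entity does not exist.
//   var errNoSuchEntity = errors.New("no such entity")
//
//   type casMC interface {
//     Add(key string, value []byte, flags uint32, ttl time.Duration) error
//     Get(key string) (value []byte, flags uint32, casID uint64, err error)
//     CompareAndSwap(key string, value []byte, flags uint32, casID uint64) error
//   }
//
//   func cachedGet(m casMC, key string, nonce []byte,
//     datastoreGet func() ([]byte, error)) ([]byte, error) {
//
//     _ = m.Add(key, nonce, flagLock, 31*time.Second) // no-op if an entry already exists
//     val, flags, casID, err := m.Get(key)
//     if err != nil || (flags == flagLock && !bytes.Equal(val, nonce)) {
//       return datastoreGet() // someone else holds the lock; bypass the cache
//     }
//     if flags == flagEntity {
//       if len(val) == 0 {
//         return nil, errNoSuchEntity // cached negative lookup
//       }
//       return val, nil // the real code decodes a PropertyMap here
//     }
//     // We hold the lock: read through to the datastore and try to populate
//     // the cache. The CAS fails harmlessly if anything else touched the key.
//     data, err := datastoreGet()
//     if err == nil {
//       _ = m.CompareAndSwap(key, encodeValue(data, 0), flagEntity, casID)
//     } else if err == errNoSuchEntity {
//       _ = m.CompareAndSwap(key, []byte{}, flagEntity, casID)
//     }
//     return data, err
//   }
//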
// Algorithm - Transactions
//
// In a transaction, all Put memcache operations are held until the very end of
// the transaction. Right before the transaction is committed, all accumulated
// Put memcache items are unconditionally set into memcache.
//
// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
//
// If the transaction is successfully committed (err == nil), then all the locks
// will be deleted.
//
// The assumption here is that get operations apply all outstanding
// transactions before they return data (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility),
// and so it is safe to purge all the locks if the transaction is known-good.
//
// If the transaction succeeds, but RunInTransaction returns an error (which can
// happen), or if the transaction fails, then the lock entries time out
// naturally. This will mean 31-ish seconds of direct datastore access, but it's
// the more-correct thing to do.
//
// Gets and Queries in a transaction pass right through without reading or
// writing memcache.
//
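// Conceptually, the commit-time handling looks like the following (a sketch
// reusing the hypothetical mc interface from the Put sketch; the real
// implementation hooks into the transaction machinery rather than taking a
// commit callback):
//
//   func commitWithLocks(m mc, pendingLockKeys []string, commit func() error) error {
//     // Flush the accumulated put-locks right before committing.
//     // Failure here is a HARD ERROR (see DANGER ZONE).
//     for _, k := range pendingLockKeys {
//       if err := m.Set(k, []byte{}, flagLock, 31*time.Second); err != nil {
//         return err
//       }
//     }
//     if err := commit(); err != nil {
//       // Leave the locks alone; they expire on their own (~31 seconds of
//       // direct datastore access for the affected keys).
//       return err
//     }
//     // The committed data is visible to subsequent gets, so purging the
//     // locks is safe (best effort).
//     _ = m.DeleteMulti(pendingLockKeys)
//     return nil
//   }
//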
// Cache control
//
// An entity may expose the following metadata (see
// datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.
//
//   - `gae:"$dscache.enable,<true|false>"` - whether or not this entity should
//     be cached at all. If omitted, dscache defaults to true.
//   - `gae:"$dscache.expiration,#seconds"` - the number of seconds of
//     persistence to use when this item is cached. 0 is infinite. If omitted,
//     defaults to 0.
//
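// For example, an entity might shorten its cache lifetime via the struct-tag
// form of this metadata. This is only a sketch: the struct, its field names,
// and the types used for the meta fields are illustrative assumptions; the
// authoritative mechanism is whatever GetMeta reports for these keys.
//
//   type UserProfile struct {
//     _kind string `gae:"$kind,UserProfile"`
//     ID    string `gae:"$id"`
//
//     // Cache entries for this entity expire after 300 seconds.
//     _ int64 `gae:"$dscache.expiration,300"`
//
//     Name string
//   }
//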
// In addition, the application may set a function shardsForKey(key) which
// returns the number of shards to use for a given datastore key. This function
// is set with the invocation of FilterRDS.
//
// Shards have the effect that all write (Put/Delete) operations clear all
// memcache entries for the given datastore entry, and all reads read (and
// possibly populate) one of the memcache entries. So if an entity has 4 shards,
// a datastore Get for it will pull from one of the 4 possible memcache keys
// at random. This is good for heavily-read, but infrequently updated, entities.
// The purpose of sharding is to alleviate hot memcache keys, as recommended by
// https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .
//
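// A sketch of what that fan-out means in terms of the keys from the first
// example (allShardKeys and readShardKey are hypothetical helpers; the shard
// count comes from the application's shardsForKey function):
//
//   import "math/rand"
//
//   // Writes clear every shard's entry; reads consult just one of them.
//   func allShardKeys(nShards int, serializedKey []byte) []string {
//     keys := make([]string, nShards)
//     for shard := range keys {
//       keys[shard] = cacheKey(1, shard, serializedKey)
//     }
//     return keys
//   }
//
//   func readShardKey(nShards int, serializedKey []byte) string {
//     return cacheKey(1, rand.Intn(nShards), serializedKey)
//   }
//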
// Caveats
//
// A couple things to note that may differ from other appengine datastore
// caching libraries (like goon, nds, or ndb).
//
//   - It does NOT provide in-memory ("per-request") caching.
//   - It's INtolerant of some memcache failures, but in exchange will not return
//     inconsistent results. See DANGER ZONE for details.
//   - Queries do not interact with the cache at all.
//   - Negative lookups (e.g. ErrNoSuchEntity) are cached.
//
// DANGER ZONE
//
// As mentioned in the Put/Delete/Transactions sections above, if the memcache
// Set fails, that's a HARD ERROR. The reason for this is that otherwise in the
// event of transient memcache failures, the memcache may be permanently left in
// an inconsistent state, since there will be nothing to actually ensure that
// the bad value is flushed from memcache. As long as the Put is allowed to
// write the lock, then all will be (eventually) well, and so all other memcache
// operations are best effort.
//
// So, if memcache is DOWN, you will effectively see tons of errors in the logs,
// and all cached datastore access will be essentially degraded to a slow
// read-only state. At this point, you have essentially 3 mitigation
// strategies:
//   - wait for memcache to come back up.
//   - dynamically disable all memcache access by writing the datastore entry:
//       /dscache,1 = {"Enable": false}
//     in the default namespace. This can be done by invoking the
//     SetDynamicGlobalEnable method. This can take up to 5 minutes to take
//     effect. If you have very long-running backend requests, you may need to
//     cycle them to have it take effect. This dynamic bit is read essentially
//     once per http request (when FilterRDS is called on the context).
//   - push a new version of the application disabling the cache filter
//     by setting InstanceEnabledStatic to false in an init() function.
//
// On every dscache.FilterRDS invocation, it takes the opportunity to fetch this
// datastore value, if it hasn't been fetched in the last
// GlobalEnabledCheckInterval time (5 minutes). This equates to essentially once
// per http request, per 5 minutes, per instance.
package dscache