// Copyright 2015 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// Package dscache provides a transparent cache for RawDatastore which is
// backed by Memcache.
//
// Inspiration
//
// Although this is not a port of any particular implementation, it takes
// inspiration from these fine libraries:
//   - https://cloud.google.com/appengine/docs/python/ndb/
//   - https://github.com/qedus/nds
//   - https://github.com/mjibson/goon
//
// Algorithm
//
// Memcache contains cache entries for single datastore entities. The memcache
// key looks like
//
//   "gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(datastore.Key))
//
// Where:
//   - vers is an ascii-hex-encoded number (currently 1).
//   - shard is a zero-based ascii-hex-encoded number (depends on shardsForKey).
//   - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated
//     memcache sizes of up to 170 Exabytes (assuming an average entry size of
//     100KB including the memcache key). This is clearly overkill, but MD5
//     could start showing collisions at this probability in as small as a 26GB
//     cache (and also MD5 sucks).
//
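// To make the format concrete, here is a minimal sketch of how such a key
// could be assembled. The helper name, and the assumption that the
// datastore.Key is first flattened to a stable byte serialization, are
// illustrative only and not this package's actual internals:
//
//   import (
//     "crypto/sha1"
//     "encoding/base64"
//     "fmt"
//   )
//
//   // cacheKey is a hypothetical helper; serializedKey is some stable
//   // serialization of the datastore.Key.
//   func cacheKey(vers, shard int, serializedKey []byte) string {
//     h := sha1.Sum(serializedKey)
//     return fmt.Sprintf("gae:%x:%x:%s", vers, shard,
//       base64.RawStdEncoding.EncodeToString(h[:]))
//   }
//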
// The memcache value is a compression byte, indicating the scheme (See
// CompressionType), followed by the encoded (and possibly compressed) value.
// Encoding is done with datastore.PropertyMap.Write(). The memcache value
// may also be the empty byte sequence, indicating that this entity is deleted.
//
// The memcache entry may also have a 'flags' value set to one of the following:
//   - 0 "entity" (cached value)
//   - 1 "lock" (someone is mutating this entry)
//
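// Purely as an illustration of that layout (the names below are made up; the
// real constants and codec are internal to this package):
//
//   const (
//     flagEntity uint32 = 0 // "entity": Value holds a cached entity (or a miss)
//     flagLock   uint32 = 1 // "lock": someone is mutating this entry
//   )
//
//   // decodeValue splits a memcache value per the scheme described above.
//   func decodeValue(val []byte) (deleted bool, compression byte, body []byte) {
//     if len(val) == 0 {
//       return true, 0, nil // cached "no such entity"
//     }
//     return false, val[0], val[1:]
//   }
//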
// Algorithm - Put and Delete
//
// On a Put (or Delete), an empty value is unconditionally written to
// memcache with a LockTimeSeconds expiration (default 31 seconds), and
// a memcache flag value of 0x1 (indicating that it's a put-locked key).
// Because a Get writes a random nonce for its own lock (described below),
// a Get operation will never mistake this empty put-lock for a lock that
// it possesses.
//
// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
//
// The datastore operation will then occur. Assuming success, Put will then
// unconditionally delete all of the memcache locks. At some point later, a
// Get will write its own lock, get the value from datastore, and compare and
// swap to populate the value (detailed below).
//
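// The write path can be sketched roughly as follows. Every helper here
// (mcSetPutLocks, dsPut, mcDeleteLocks) is hypothetical shorthand for the
// steps above, not a function exported by this package:
//
//   func cachedPut(ctx context.Context, ent interface{}) error {
//     // 1. Unconditionally write empty put-lock items (flag 0x1, ~31s TTL)
//     //    for every shard of every affected key.
//     if err := mcSetPutLocks(ctx, ent); err != nil {
//       return err // HARD ERROR; see DANGER ZONE.
//     }
//     // 2. Perform the real datastore write.
//     if err := dsPut(ctx, ent); err != nil {
//       return err // locks simply expire; reads bypass the cache meanwhile.
//     }
//     // 3. Best effort: delete the locks so Gets may repopulate the cache.
//     mcDeleteLocks(ctx, ent)
//     return nil
//   }
//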
// Algorithm - Get
//
// On a Get, "Add" a lock for it (which only does something if there's no entry
// in memcache yet) with a nonce value. We immediately Get the memcache entries
// back (for CAS purposes later).
//
// If it doesn't exist (unlikely since we just Add'd it) or if its flag is
// "lock" and the Value != the nonce we put there, go hit the datastore without
// trying to update memcache.
//
// If its flag is "entity", decode the object and return it. If the Value is
// the empty byte sequence, return ErrNoSuchEntity.
//
// If its flag is "lock" and the Value equals the nonce, go get it from the
// datastore. If that's successful, then encode the value to bytes, and CAS
// the object to memcache. The CAS will succeed if nothing else touched the
// memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).
//
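// In the same hypothetical shorthand (none of these helpers or auxiliary types
// are part of this package's API), the read path looks like:
//
//   func cachedGet(ctx context.Context, key Key) (PropertyMap, error) {
//     nonce := randomNonce()
//     mcAddLock(ctx, key, nonce)  // no-op if any entry already exists
//     item, ok := mcGet(ctx, key) // re-fetch so a CAS is possible later
//     switch {
//     case !ok, item.flag == flagLock && !bytes.Equal(item.value, nonce):
//       // Missing, or someone else holds the lock: read through.
//       return dsGet(ctx, key)
//     case item.flag == flagEntity:
//       if len(item.value) == 0 {
//         return nil, ErrNoSuchEntity // cached negative lookup
//       }
//       return decodeEntity(item.value)
//     default:
//       // It is our own lock: fill the cache via compare-and-swap.
//       pm, err := dsGet(ctx, key)
//       if err == nil || err == ErrNoSuchEntity {
//         mcCompareAndSwap(ctx, item, encodeEntity(pm, err)) // losing the race is fine
//       }
//       return pm, err
//     }
//   }
//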
// Algorithm - Transactions
//
// In a transaction, all Put memcache operations are held until the very end of
// the transaction. Right before the transaction is committed, all accumulated
// Put memcache items are unconditionally set into memcache.
//
// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
//
// If the transaction is successfully committed (err == nil), then all the locks
// will be deleted.
//
// The assumption here is that get operations apply all outstanding
// transactions before they return data
// (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility),
// and so it is safe to purge all the locks if the transaction is known-good.
//
// If the transaction succeeds, but RunInTransaction returns an error (which can
// happen), or if the transaction fails, then the lock entries time out
// naturally. This will mean 31-ish seconds of direct datastore access, but it's
// the more-correct thing to do.
//
// Gets and Queries in a transaction pass right through without reading or
// writing memcache.
//
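// The ordering, sketched with the same hypothetical helpers (the bookkeeping
// that the filtered transaction context would do to collect put-locks is
// elided):
//
//   func cachedRunInTransaction(ctx context.Context, f func(context.Context) error) error {
//     var locks []lockItem // the filtered txn context would append put-locks here (elided)
//     err := dsRunInTransaction(ctx, func(txn context.Context) error {
//       if err := f(txn); err != nil {
//         return err
//       }
//       // Right before commit: unconditionally set all accumulated put-locks.
//       return mcSetItems(ctx, locks) // HARD ERROR; see DANGER ZONE.
//     })
//     if err == nil {
//       mcDeleteItems(ctx, locks) // known-good commit: locks may be purged now
//     }
//     // Otherwise the locks just time out naturally (~31 seconds).
//     return err
//   }
//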
// Cache control
//
// An entity may expose the following metadata (see
// datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.
//
//   - `gae:"$dscache.enable,<true|false>"` - whether or not this entity should
//     be cached at all. If omitted, dscache defaults to true.
//   - `gae:"$dscache.expiration,#seconds"` - the number of seconds of
//     persistence to use when this item is cached. 0 is infinite. If omitted,
//     defaults to 0.
//
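// For example, one way an entity type could opt out of caching is by answering
// the "dscache.enable" meta key itself. This sketch assumes the entity's own
// GetMeta override is consulted (per the datastore.PropertyLoadSaver.GetMeta
// reference above); the struct and field names are made up:
//
//   type configBlob struct {
//     ID   string `gae:"$id"`
//     Data []byte `gae:",noindex"`
//   }
//
//   // GetMeta answers dscache's meta queries; anything else falls through
//   // to the defaults.
//   func (*configBlob) GetMeta(key string) (interface{}, bool) {
//     if key == "dscache.enable" {
//       return false, true // never cache this kind
//     }
//     return nil, false
//   }
//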
// In addition, the application may set a function shardsForKey(key) which
// returns the number of shards to use for a given datastore key. This function
// is set with the invocation of FilterRDS.
//
// Shards have the effect that all write (Put/Delete) operations clear all
// memcache entries for the given datastore entry, and all reads read (and
// possibly populate) one of the memcache entries. So if an entity has 4 shards,
// a datastore Get for it will pull from one of the 4 possible memcache keys
// at random. This is good for heavily-read, but infrequently updated, entities:
// spreading such an entity across multiple memcache entries avoids turning its
// single memcache key into a hot spot.
//
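// How the filter and shard function are wired up, as a rough sketch only: the
// exact FilterRDS and callback signatures should be taken from this package's
// declarations, and the kind-based policy below is invented:
//
//   func withCachedDatastore(ctx context.Context) context.Context {
//     // Assumed shape: FilterRDS(ctx, shardsForKey) context.Context.
//     return dscache.FilterRDS(ctx, func(key datastore.Key) int {
//       if key.Kind() == "HotCounter" { // read constantly, written rarely
//         return 4 // spread across 4 memcache entries
//       }
//       return 1
//     })
//   }
//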
// Caveats
//
// A couple things to note that may differ from other appengine datastore
// caching libraries (like goon, nds, or ndb).
//
//   - It does NOT provide in-memory ("per-request") caching.
//   - It's INtolerant of some memcache failures, but in exchange will not return
//     inconsistent results. See DANGER ZONE for details.
//   - Queries do not interact with the cache at all.
//   - Negative lookups (e.g. ErrNoSuchEntity) are cached.
//   - Allows sharding hot memcache entries as recommended by
//     https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .
//
// DANGER ZONE
//
// As mentioned in the Put/Delete/Transactions sections above, if the memcache
// Set fails, that's a HARD ERROR. The reason for this is that otherwise in the
// event of transient memcache failures, the memcache may be permanently left in
// an inconsistent state, since there will be nothing to actually ensure that
// the bad value is flushed from memcache. As long as the Put is allowed to
// write the lock, then all will be (eventually) well, and so all other memcache
// operations are best effort.
//
// So, if memcache is DOWN, you will effectively see tons of errors in the logs,
// and all cached datastore access will be essentially degraded to a slow
// read-only state. At this point, you have essentially 3 mitigation
// strategies:
//   - wait for memcache to come back up.
//   - dynamically disable all memcache access by writing the datastore entry:
//       /dscache,1 = {"Enable": false}
//     in the default namespace. This can be done by invoking the
//     SetDynamicGlobalEnable method (see the sketch after this list). This can
//     take up to 5 minutes to take effect. If you have very long-running
//     backend requests, you may need to cycle them to have it take effect. This
//     dynamic bit is read essentially once per http request (when FilterRDS is
//     called on the context).
//   - push a new version of the application disabling the cache filter
//     by setting InstanceEnabledStatic to false in an init() function (see the
//     sketch after this list).
//
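// Sketches of those last two mitigations. The SetDynamicGlobalEnable signature
// shown is an assumption to be checked against this package; the handler name
// is made up; InstanceEnabledStatic is the package-level switch mentioned
// above:
//
//   // Statically disable the filter with the next deployed version.
//   func init() {
//     dscache.InstanceEnabledStatic = false
//   }
//
//   // Or flip the dynamic bit from, say, an admin handler.
//   func disableDSCache(ctx context.Context) error {
//     return dscache.SetDynamicGlobalEnable(ctx, false) // assumed signature
//   }
//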
// On every dscache.FilterRDS invocation, it takes the opportunity to fetch this
// datastore value, if it hasn't been fetched in the last
// GlobalEnabledCheckInterval time (5 minutes). This equates to essentially once
// per http request, per 5 minutes, per instance.
package dscache