Index: third_party/cherrypy/lib/cpstats.py |
=================================================================== |
--- third_party/cherrypy/lib/cpstats.py (revision 0) |
+++ third_party/cherrypy/lib/cpstats.py (revision 0) |
@@ -0,0 +1,662 @@ |
+"""CPStats, a package for collecting and reporting on program statistics. |
+ |
+Overview |
+======== |
+ |
+Statistics about program operation are an invaluable monitoring and debugging |
+tool. Unfortunately, the gathering and reporting of these critical values is |
+usually ad-hoc. This package aims to add a centralized place for gathering |
+statistical performance data, a structure for recording that data which |
+provides for extrapolation of that data into more useful information, |
+and a method of serving that data to both human investigators and |
+monitoring software. Let's examine each of those in more detail. |
+ |
+Data Gathering |
+-------------- |
+ |
+Just as Python's `logging` module provides a common importable for gathering |
+and sending messages, performance statistics would benefit from a similar |
+common mechanism, and one that does *not* require each package which wishes |
+to collect stats to import a third-party module. Therefore, we choose to |
+re-use the `logging` module by adding a `statistics` object to it. |
+ |
+That `logging.statistics` object is a nested dict. It is not a custom class, |
+because that would 1) require libraries and applications to import a third- |
+party module in order to participate, 2) inhibit innovation in extrapolation |
+approaches and in reporting tools, and 3) be slow. There are, however, some |
+specifications regarding the structure of the dict. |
+ |
+ { |
+ +----"SQLAlchemy": { |
+ | "Inserts": 4389745, |
+ | "Inserts per Second": |
+ | lambda s: s["Inserts"] / (time() - s["Start"]), |
+ | C +---"Table Statistics": { |
+ | o | "widgets": {-----------+ |
+ N | l | "Rows": 1.3M, | Record |
+ a | l | "Inserts": 400, | |
+ m | e | },---------------------+ |
+ e | c | "froobles": { |
+ s | t | "Rows": 7845, |
+ p | i | "Inserts": 0, |
+ a | o | }, |
+ c | n +---}, |
+ e | "Slow Queries": |
+ | [{"Query": "SELECT * FROM widgets;", |
+ | "Processing Time": 47.840923343, |
+ | }, |
+ | ], |
+ +----}, |
+ } |
+ |
+The `logging.statistics` dict has four levels. The topmost level is nothing |
+more than a set of names to introduce modularity, usually along the lines of |
+package names. If the SQLAlchemy project wanted to participate, for example, |
+it might populate the item `logging.statistics['SQLAlchemy']`, whose value |
+would be a second-layer dict we call a "namespace". Namespaces help multiple |
+packages to avoid collisions over key names, and make reports easier to read, |
+to boot. The maintainers of SQLAlchemy should feel free to use more than one |
+namespace if needed (such as 'SQLAlchemy ORM'). Note that there are no case |
+or other syntax constraints on the namespace names; they should be chosen |
+to be maximally readable by humans (neither too short nor too long). |
+ |
+Each namespace, then, is a dict of named statistical values, such as |
+'Requests/sec' or 'Uptime'. You should choose names which will look |
+good on a report: spaces and capitalization are just fine. |
+ |
+In addition to scalars, values in a namespace MAY be a (third-layer) |
+dict, or a list, called a "collection". For example, the CherryPy StatsTool |
+keeps track of what each request is doing (or has most recently done) |
+in a 'Requests' collection, where each key is a thread ID; each |
+value in the subdict MUST be a fourth dict (whew!) of statistical data about |
+each thread. We call each subdict in the collection a "record". Similarly, |
+the StatsTool also keeps a list of slow queries, where each record contains |
+data about each slow query, in order. |
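+
+For illustration only (the namespace and key names below are invented for
+this example, not part of any spec), a namespace holding a collection of
+per-thread records could be built like this:
+
+    import logging
+    import threading
+
+    if not hasattr(logging, 'statistics'): logging.statistics = {}
+    # Namespace (second layer)
+    ns = logging.statistics.setdefault('My Package', {})
+    # Collection (third layer): one record (fourth layer) per thread
+    workers = ns.setdefault('Workers', {})
+    workers[threading.current_thread().ident] = {
+        'Requests': 0,
+        'State': 'idle',
+        }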
+ |
+Values in a namespace or record may also be functions, which brings us to: |
+ |
+Extrapolation |
+------------- |
+ |
+The collection of statistical data needs to be fast, as close to unnoticeable |
+as possible to the host program. That requires us to minimize I/O, for example, |
+but in Python it also means we need to minimize function calls. So when you |
+are designing your namespace and record values, try to insert the most basic |
+scalar values you already have on hand. |
+ |
+When it comes time to report on the gathered data, however, we usually have |
+much more freedom in what we can calculate. Therefore, whenever reporting |
+tools (like the provided StatsPage CherryPy class) fetch the contents of |
+`logging.statistics` for reporting, they first call `extrapolate_statistics` |
+(passing the whole `statistics` dict as the only argument). This makes a |
+deep copy of the statistics dict so that the reporting tool can both iterate |
+over it and even change it without harming the original. But it also expands |
+any functions in the dict by calling them. For example, you might have a |
+'Current Time' entry in the namespace with the value "lambda scope: time.time()". |
+The "scope" parameter is the current namespace dict (or record, if we're |
+currently expanding one of those instead), allowing you access to existing |
+static entries. If you're truly evil, you can even modify more than one entry |
+at a time. |
+ |
+However, don't try to calculate an entry and then use its value in further |
+extrapolations; the order in which the functions are called is not guaranteed. |
+This can lead to a certain amount of duplicated work (or a redesign of your |
+schema), but that's better than complicating the spec. |
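+
+As a minimal sketch (the 'My Package' namespace and its keys are invented
+for this example), extrapolation turns function entries into plain values
+in the returned copy while leaving the original dict untouched:
+
+    import logging
+    import time
+    from cherrypy.lib import cpstats
+
+    ns = logging.statistics.setdefault('My Package', {})
+    ns['Start Time'] = time.time()
+    ns['Uptime'] = lambda s: time.time() - s['Start Time']
+
+    report = cpstats.extrapolate_statistics(logging.statistics)
+    # report['My Package']['Uptime'] is now a plain float;
+    # logging.statistics['My Package']['Uptime'] is still the lambda.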
+ |
+After the whole thing has been extrapolated, it's time for: |
+ |
+Reporting |
+--------- |
+ |
+The StatsPage class grabs the `logging.statistics` dict, extrapolates it all, |
+and then transforms it to HTML for easy viewing. Each namespace gets its own |
+header and attribute table, plus an extra table for each collection. This is |
+NOT part of the statistics specification; other tools can format how they like. |
+ |
+You can control which columns are output and how they are formatted by updating |
+StatsPage.formatting, which is a dict that mirrors the keys and nesting of |
+`logging.statistics`. The difference is that, instead of data values, it has |
+formatting values. Use None for a given key to indicate to the StatsPage that a |
+given column should not be output. Use a string with formatting (such as '%.3f') |
+to interpolate the value(s), or use a callable (such as lambda v: v.isoformat()) |
+for more advanced formatting. Any entry which is not mentioned in the formatting |
+dict is output unchanged. |
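+
+For example (using an invented 'My Package' namespace and invented keys),
+you might register formatting like this before serving the page:
+
+    from cherrypy.lib import cpstats
+
+    cpstats.StatsPage.formatting['My Package'] = {
+        'Start Time': cpstats.iso_format,  # callable formatter
+        'Uptime': '%.3f',                  # printf-style format string
+        'Internal Counter': None,          # omit this column entirely
+        }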
+ |
+Monitoring |
+---------- |
+ |
+Although the HTML output takes pains to assign unique id's to each <td> with |
+statistical data, you're probably better off fetching /cpstats/data, which |
+outputs the whole (extrapolated) `logging.statistics` dict in JSON format. |
+That is probably easier to parse, and doesn't have any formatting controls, |
+so you get the "original" data in a consistently-serialized format. |
+Note: there's no treatment yet for datetime objects. Try time.time() instead |
+for now if you can. Nagios will probably thank you. |
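+
+A sketch of polling that endpoint (assuming Python 2, which this module's
+idioms target, and a server listening on 127.0.0.1:8080 with a StatsPage
+mounted at /cpstats):
+
+    import json
+    import urllib2
+
+    raw = urllib2.urlopen('http://127.0.0.1:8080/cpstats/data').read()
+    stats = json.loads(raw)
+    print(stats['CherryPy Applications']['Requests/Second'])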
+ |
+Turning Collection Off |
+---------------------- |
+ |
+It is recommended each namespace have an "Enabled" item which, if False, |
+stops collection (but not reporting) of statistical data. Applications |
+SHOULD provide controls to pause and resume collection by setting these |
+entries to False or True, if present. |
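+
+For example, to pause and later resume the built-in 'CherryPy Applications'
+namespace (which is exactly what the StatsPage pause/resume handlers do):
+
+    import logging
+
+    appstats = logging.statistics.get('CherryPy Applications', {})
+    appstats['Enabled'] = False   # pause collection
+    # ... later ...
+    appstats['Enabled'] = True    # resume collection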
+ |
+ |
+Usage |
+===== |
+ |
+To collect statistics on CherryPy applications: |
+ |
+ from cherrypy.lib import cpstats |
+ appconfig['/']['tools.cpstats.on'] = True |
+ |
+To collect statistics on your own code: |
+ |
+    import logging
+    import time
+ # Initialize the repository |
+ if not hasattr(logging, 'statistics'): logging.statistics = {} |
+ # Initialize my namespace |
+ mystats = logging.statistics.setdefault('My Stuff', {}) |
+ # Initialize my namespace's scalars and collections |
+ mystats.update({ |
+ 'Enabled': True, |
+ 'Start Time': time.time(), |
+ 'Important Events': 0, |
+ 'Events/Second': lambda s: ( |
+ (s['Important Events'] / (time.time() - s['Start Time']))), |
+ }) |
+ ... |
+ for event in events: |
+ ... |
+ # Collect stats |
+ if mystats.get('Enabled', False): |
+ mystats['Important Events'] += 1 |
+ |
+To report statistics: |
+ |
+ root.cpstats = cpstats.StatsPage() |
+ |
+To format statistics reports: |
+ |
+ See 'Reporting', above. |
+ |
+""" |
+ |
+# -------------------------------- Statistics -------------------------------- # |
+ |
+import logging |
+if not hasattr(logging, 'statistics'): logging.statistics = {} |
+ |
+def extrapolate_statistics(scope): |
+ """Return an extrapolated copy of the given scope.""" |
+ c = {} |
+ for k, v in list(scope.items()): |
+ if isinstance(v, dict): |
+ v = extrapolate_statistics(v) |
+ elif isinstance(v, (list, tuple)): |
+ v = [extrapolate_statistics(record) for record in v] |
+ elif hasattr(v, '__call__'): |
+ v = v(scope) |
+ c[k] = v |
+ return c |
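+
+# Illustrative behavior (not executed here): given
+#     stats = {'NS': {'Start': 10.0,
+#                     'Uptime': lambda s: time.time() - s['Start'],
+#                     'Requests': [{'Processing Time': 1.5}]}}
+# extrapolate_statistics(stats) returns a copy in which nested dicts and
+# lists of records are copied recursively and the 'Uptime' callable is
+# replaced by its return value, computed with the enclosing dict as scope.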
+ |
+ |
+# --------------------- CherryPy Applications Statistics --------------------- # |
+ |
+import threading |
+import time |
+ |
+import cherrypy |
+ |
+appstats = logging.statistics.setdefault('CherryPy Applications', {}) |
+appstats.update({ |
+ 'Enabled': True, |
+ 'Bytes Read/Request': lambda s: (s['Total Requests'] and |
+ (s['Total Bytes Read'] / float(s['Total Requests'])) or 0.0), |
+ 'Bytes Read/Second': lambda s: s['Total Bytes Read'] / s['Uptime'](s), |
+ 'Bytes Written/Request': lambda s: (s['Total Requests'] and |
+ (s['Total Bytes Written'] / float(s['Total Requests'])) or 0.0), |
+ 'Bytes Written/Second': lambda s: s['Total Bytes Written'] / s['Uptime'](s), |
+ 'Current Time': lambda s: time.time(), |
+ 'Current Requests': 0, |
+ 'Requests/Second': lambda s: float(s['Total Requests']) / s['Uptime'](s), |
+ 'Server Version': cherrypy.__version__, |
+ 'Start Time': time.time(), |
+ 'Total Bytes Read': 0, |
+ 'Total Bytes Written': 0, |
+ 'Total Requests': 0, |
+ 'Total Time': 0, |
+ 'Uptime': lambda s: time.time() - s['Start Time'], |
+ 'Requests': {}, |
+ }) |
+ |
+proc_time = lambda s: time.time() - s['Start Time'] |
+ |
+ |
+class ByteCountWrapper(object): |
+ """Wraps a file-like object, counting the number of bytes read.""" |
+ |
+ def __init__(self, rfile): |
+ self.rfile = rfile |
+ self.bytes_read = 0 |
+ |
+ def read(self, size=-1): |
+ data = self.rfile.read(size) |
+ self.bytes_read += len(data) |
+ return data |
+ |
+ def readline(self, size=-1): |
+ data = self.rfile.readline(size) |
+ self.bytes_read += len(data) |
+ return data |
+ |
+ def readlines(self, sizehint=0): |
+ # Shamelessly stolen from StringIO |
+ total = 0 |
+ lines = [] |
+ line = self.readline() |
+ while line: |
+ lines.append(line) |
+ total += len(line) |
+ if 0 < sizehint <= total: |
+ break |
+ line = self.readline() |
+ return lines |
+ |
+ def close(self): |
+ self.rfile.close() |
+ |
+ def __iter__(self): |
+ return self |
+ |
+ def next(self): |
+ data = self.rfile.next() |
+ self.bytes_read += len(data) |
+ return data |
+ |
+ |
+average_uriset_time = lambda s: s['Count'] and (s['Sum'] / s['Count']) or 0 |
+ |
+ |
+class StatsTool(cherrypy.Tool): |
+ """Record various information about the current request.""" |
+ |
+ def __init__(self): |
+ cherrypy.Tool.__init__(self, 'on_end_request', self.record_stop) |
+ |
+ def _setup(self): |
+ """Hook this tool into cherrypy.request. |
+ |
+ The standard CherryPy request object will automatically call this |
+ method when the tool is "turned on" in config. |
+ """ |
+ if appstats.get('Enabled', False): |
+ cherrypy.Tool._setup(self) |
+ self.record_start() |
+ |
+ def record_start(self): |
+ """Record the beginning of a request.""" |
+ request = cherrypy.serving.request |
+ if not hasattr(request.rfile, 'bytes_read'): |
+ request.rfile = ByteCountWrapper(request.rfile) |
+ request.body.fp = request.rfile |
+ |
+ r = request.remote |
+ |
+ appstats['Current Requests'] += 1 |
+ appstats['Total Requests'] += 1 |
+ appstats['Requests'][threading._get_ident()] = { |
+ 'Bytes Read': None, |
+ 'Bytes Written': None, |
+ # Use a lambda so the ip gets updated by tools.proxy later |
+ 'Client': lambda s: '%s:%s' % (r.ip, r.port), |
+ 'End Time': None, |
+ 'Processing Time': proc_time, |
+ 'Request-Line': request.request_line, |
+ 'Response Status': None, |
+ 'Start Time': time.time(), |
+ } |
+ |
+ def record_stop(self, uriset=None, slow_queries=1.0, slow_queries_count=100, |
+ debug=False, **kwargs): |
+ """Record the end of a request.""" |
+ resp = cherrypy.serving.response |
+ w = appstats['Requests'][threading._get_ident()] |
+ |
+ r = cherrypy.request.rfile.bytes_read |
+ w['Bytes Read'] = r |
+ appstats['Total Bytes Read'] += r |
+ |
+ if resp.stream: |
+ w['Bytes Written'] = 'chunked' |
+ else: |
+ cl = int(resp.headers.get('Content-Length', 0)) |
+ w['Bytes Written'] = cl |
+ appstats['Total Bytes Written'] += cl |
+ |
+ w['Response Status'] = getattr(resp, 'output_status', None) or resp.status |
+ |
+ w['End Time'] = time.time() |
+ p = w['End Time'] - w['Start Time'] |
+ w['Processing Time'] = p |
+ appstats['Total Time'] += p |
+ |
+ appstats['Current Requests'] -= 1 |
+ |
+ if debug: |
+ cherrypy.log('Stats recorded: %s' % repr(w), 'TOOLS.CPSTATS') |
+ |
+ if uriset: |
+ rs = appstats.setdefault('URI Set Tracking', {}) |
+ r = rs.setdefault(uriset, { |
+ 'Min': None, 'Max': None, 'Count': 0, 'Sum': 0, |
+ 'Avg': average_uriset_time}) |
+ if r['Min'] is None or p < r['Min']: |
+ r['Min'] = p |
+ if r['Max'] is None or p > r['Max']: |
+ r['Max'] = p |
+ r['Count'] += 1 |
+ r['Sum'] += p |
+ |
+ if slow_queries and p > slow_queries: |
+ sq = appstats.setdefault('Slow Queries', []) |
+ sq.append(w.copy()) |
+ if len(sq) > slow_queries_count: |
+ sq.pop(0) |
+ |
+ |
+import cherrypy |
+cherrypy.tools.cpstats = StatsTool() |
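+
+# Example configuration (illustrative values): the keyword arguments of
+# StatsTool.record_stop() above can be supplied through ordinary tool
+# config keys, e.g. in an application config dict:
+#
+#     appconfig['/'] = {
+#         'tools.cpstats.on': True,
+#         'tools.cpstats.debug': False,
+#         'tools.cpstats.uriset': 'api',
+#         'tools.cpstats.slow_queries': 0.5,       # seconds
+#         'tools.cpstats.slow_queries_count': 100,
+#     }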
+ |
+ |
+# ---------------------- CherryPy Statistics Reporting ---------------------- # |
+ |
+import os |
+thisdir = os.path.abspath(os.path.dirname(__file__)) |
+ |
+try: |
+ import json |
+except ImportError: |
+ try: |
+ import simplejson as json |
+ except ImportError: |
+ json = None |
+ |
+ |
+missing = object() |
+ |
+locale_date = lambda v: time.strftime('%c', time.gmtime(v)) |
+iso_format = lambda v: time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(v)) |
+ |
+def pause_resume(ns): |
+ def _pause_resume(enabled): |
+ pause_disabled = '' |
+ resume_disabled = '' |
+ if enabled: |
+ resume_disabled = 'disabled="disabled" ' |
+ else: |
+ pause_disabled = 'disabled="disabled" ' |
+ return """ |
+ <form action="pause" method="POST" style="display:inline"> |
+ <input type="hidden" name="namespace" value="%s" /> |
+ <input type="submit" value="Pause" %s/> |
+ </form> |
+ <form action="resume" method="POST" style="display:inline"> |
+ <input type="hidden" name="namespace" value="%s" /> |
+ <input type="submit" value="Resume" %s/> |
+ </form> |
+ """ % (ns, pause_disabled, ns, resume_disabled) |
+ return _pause_resume |
+ |
+ |
+class StatsPage(object): |
+ |
+ formatting = { |
+ 'CherryPy Applications': { |
+ 'Enabled': pause_resume('CherryPy Applications'), |
+ 'Bytes Read/Request': '%.3f', |
+ 'Bytes Read/Second': '%.3f', |
+ 'Bytes Written/Request': '%.3f', |
+ 'Bytes Written/Second': '%.3f', |
+ 'Current Time': iso_format, |
+ 'Requests/Second': '%.3f', |
+ 'Start Time': iso_format, |
+ 'Total Time': '%.3f', |
+ 'Uptime': '%.3f', |
+ 'Slow Queries': { |
+ 'End Time': None, |
+ 'Processing Time': '%.3f', |
+ 'Start Time': iso_format, |
+ }, |
+ 'URI Set Tracking': { |
+ 'Avg': '%.3f', |
+ 'Max': '%.3f', |
+ 'Min': '%.3f', |
+ 'Sum': '%.3f', |
+ }, |
+ 'Requests': { |
+ 'Bytes Read': '%s', |
+ 'Bytes Written': '%s', |
+ 'End Time': None, |
+ 'Processing Time': '%.3f', |
+ 'Start Time': None, |
+ }, |
+ }, |
+ 'CherryPy WSGIServer': { |
+ 'Enabled': pause_resume('CherryPy WSGIServer'), |
+ 'Connections/second': '%.3f', |
+ 'Start time': iso_format, |
+ }, |
+ } |
+ |
+ |
+ def index(self): |
+ # Transform the raw data into pretty output for HTML |
+ yield """ |
+<html> |
+<head> |
+ <title>Statistics</title> |
+<style> |
+ |
+th, td { |
+ padding: 0.25em 0.5em; |
+ border: 1px solid #666699; |
+} |
+ |
+table { |
+ border-collapse: collapse; |
+} |
+ |
+table.stats1 { |
+ width: 100%; |
+} |
+ |
+table.stats1 th { |
+ font-weight: bold; |
+ text-align: right; |
+ background-color: #CCD5DD; |
+} |
+ |
+table.stats2, h2 { |
+ margin-left: 50px; |
+} |
+ |
+table.stats2 th { |
+ font-weight: bold; |
+ text-align: center; |
+ background-color: #CCD5DD; |
+} |
+ |
+</style> |
+</head> |
+<body> |
+""" |
+ for title, scalars, collections in self.get_namespaces(): |
+ yield """ |
+<h1>%s</h1> |
+ |
+<table class='stats1'> |
+ <tbody> |
+""" % title |
+ for i, (key, value) in enumerate(scalars): |
+ colnum = i % 3 |
+ if colnum == 0: yield """ |
+ <tr>""" |
+ yield """ |
+ <th>%(key)s</th><td id='%(title)s-%(key)s'>%(value)s</td>""" % vars() |
+ if colnum == 2: yield """ |
+ </tr>""" |
+ |
+ if colnum == 0: yield """ |
+ <th></th><td></td> |
+ <th></th><td></td> |
+ </tr>""" |
+ elif colnum == 1: yield """ |
+ <th></th><td></td> |
+ </tr>""" |
+ yield """ |
+ </tbody> |
+</table>""" |
+ |
+ for subtitle, headers, subrows in collections: |
+ yield """ |
+<h2>%s</h2> |
+<table class='stats2'> |
+ <thead> |
+ <tr>""" % subtitle |
+ for key in headers: |
+ yield """ |
+ <th>%s</th>""" % key |
+ yield """ |
+ </tr> |
+ </thead> |
+ <tbody>""" |
+ for subrow in subrows: |
+ yield """ |
+ <tr>""" |
+ for value in subrow: |
+ yield """ |
+ <td>%s</td>""" % value |
+ yield """ |
+ </tr>""" |
+ yield """ |
+ </tbody> |
+</table>""" |
+ yield """ |
+</body> |
+</html> |
+""" |
+ index.exposed = True |
+ |
+ def get_namespaces(self): |
+ """Yield (title, scalars, collections) for each namespace.""" |
+ s = extrapolate_statistics(logging.statistics) |
+ for title, ns in sorted(s.items()): |
+ scalars = [] |
+ collections = [] |
+ ns_fmt = self.formatting.get(title, {}) |
+ for k, v in sorted(ns.items()): |
+ fmt = ns_fmt.get(k, {}) |
+ if isinstance(v, dict): |
+ headers, subrows = self.get_dict_collection(v, fmt) |
+ collections.append((k, ['ID'] + headers, subrows)) |
+ elif isinstance(v, (list, tuple)): |
+ headers, subrows = self.get_list_collection(v, fmt) |
+ collections.append((k, headers, subrows)) |
+ else: |
+ format = ns_fmt.get(k, missing) |
+ if format is None: |
+ # Don't output this column. |
+ continue |
+ if hasattr(format, '__call__'): |
+ v = format(v) |
+ elif format is not missing: |
+ v = format % v |
+ scalars.append((k, v)) |
+ yield title, scalars, collections |
+ |
+ def get_dict_collection(self, v, formatting): |
+ """Return ([headers], [rows]) for the given collection.""" |
+ # E.g., the 'Requests' dict. |
+ headers = [] |
+ for record in v.itervalues(): |
+ for k3 in record: |
+ format = formatting.get(k3, missing) |
+ if format is None: |
+ # Don't output this column. |
+ continue |
+ if k3 not in headers: |
+ headers.append(k3) |
+ headers.sort() |
+ |
+ subrows = [] |
+ for k2, record in sorted(v.items()): |
+ subrow = [k2] |
+ for k3 in headers: |
+ v3 = record.get(k3, '') |
+ format = formatting.get(k3, missing) |
+ if format is None: |
+ # Don't output this column. |
+ continue |
+ if hasattr(format, '__call__'): |
+ v3 = format(v3) |
+ elif format is not missing: |
+ v3 = format % v3 |
+ subrow.append(v3) |
+ subrows.append(subrow) |
+ |
+ return headers, subrows |
+ |
+ def get_list_collection(self, v, formatting): |
+ """Return ([headers], [subrows]) for the given collection.""" |
+ # E.g., the 'Slow Queries' list. |
+ headers = [] |
+ for record in v: |
+ for k3 in record: |
+ format = formatting.get(k3, missing) |
+ if format is None: |
+ # Don't output this column. |
+ continue |
+ if k3 not in headers: |
+ headers.append(k3) |
+ headers.sort() |
+ |
+ subrows = [] |
+ for record in v: |
+ subrow = [] |
+ for k3 in headers: |
+ v3 = record.get(k3, '') |
+ format = formatting.get(k3, missing) |
+ if format is None: |
+ # Don't output this column. |
+ continue |
+ if hasattr(format, '__call__'): |
+ v3 = format(v3) |
+ elif format is not missing: |
+ v3 = format % v3 |
+ subrow.append(v3) |
+ subrows.append(subrow) |
+ |
+ return headers, subrows |
+ |
+ if json is not None: |
+ def data(self): |
+ s = extrapolate_statistics(logging.statistics) |
+ cherrypy.response.headers['Content-Type'] = 'application/json' |
+ return json.dumps(s, sort_keys=True, indent=4) |
+ data.exposed = True |
+ |
+ def pause(self, namespace): |
+ logging.statistics.get(namespace, {})['Enabled'] = False |
+ raise cherrypy.HTTPRedirect('./') |
+ pause.exposed = True |
+ pause.cp_config = {'tools.allow.on': True, |
+ 'tools.allow.methods': ['POST']} |
+ |
+ def resume(self, namespace): |
+ logging.statistics.get(namespace, {})['Enabled'] = True |
+ raise cherrypy.HTTPRedirect('./') |
+ resume.exposed = True |
+ resume.cp_config = {'tools.allow.on': True, |
+ 'tools.allow.methods': ['POST']} |
+ |
Property changes on: third_party/cherrypy/lib/cpstats.py |
___________________________________________________________________ |
Added: svn:eol-style |
+ LF |