third_party/cherrypy/_cpreqbody.py - Issue 9368042: Add CherryPy to third_party.

Unified Diff: third_party/cherrypy/_cpreqbody.py

Issue 9368042: Add CherryPy to third_party. (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/tools/build/

Patch Set: '' Created 8 years, 10 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View side-by-side diff with in-line comments

Index: third_party/cherrypy/_cpreqbody.py

===================================================================

--- third_party/cherrypy/_cpreqbody.py (revision 0)

+++ third_party/cherrypy/_cpreqbody.py (revision 0)

@@ -0,0 +1,965 @@

+"""Request body processing for CherryPy.

+.. versionadded:: 3.2

+Application authors have complete control over the parsing of HTTP request

+entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`

+is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,

+and *that* class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.

+When an HTTP request includes an entity body, it is often desirable to

+provide that information to applications in a form other than the raw bytes.

+Different content types demand different approaches. Examples:

+ * For a GIF file, we want the raw bytes in a stream.

+ * An HTML form is better parsed into its component fields, and each text field

+ decoded from bytes to unicode.

+ * A JSON body should be deserialized into a Python dict or list.

+When the request contains a Content-Type header, the media type is used as a

+key to look up a value in the

+:attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.

+If the full media

+type is not found, then the major type is tried; for example, if no processor

+is found for the 'image/jpeg' type, then we look for a processor for the 'image'

+types altogether. If neither the full type nor the major type has a matching

+processor, then a default processor is used

+(:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most

+types, this means no processing is done, and the body is left unread as a

+raw byte stream. Processors are configurable in an 'on_start_resource' hook.

+Some processors, especially those for the 'text' types, attempt to decode bytes

+to unicode. If the Content-Type request header includes a 'charset' parameter,

+this is used to decode the entity. Otherwise, one or more default charsets may

+be attempted, although this decision is up to each processor. If a processor

+successfully decodes an Entity or Part, it should set the

+:attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute

+on the Entity or Part to the name of the successful charset, so that

+applications can easily re-encode or transcode the value if they wish.

+If the Content-Type of the request entity is of major type 'multipart', then

+the above parsing process, and possibly a decoding process, is performed for

+each part.

+For both the full entity and multipart parts, a Content-Disposition header may

+be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and

+:attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the

+request.body or the Part.

+.. _custombodyprocessors:

+Custom Processors

+=================

+You can add your own processors for any specific or major MIME type. Simply add

+it to the :attr:`processors<cherrypy._cprequest.Entity.processors>` dict in a

+hook/tool that runs at ``on_start_resource`` or ``before_request_body``.

+Here's the built-in JSON tool for an example::

+ def json_in(force=True, debug=False):

+ request = cherrypy.serving.request

+ def json_processor(entity):

+ \"""Read application/json data into request.json.\"""

+ if not entity.headers.get("Content-Length", ""):

+ raise cherrypy.HTTPError(411)

+ body = entity.fp.read()

+ try:

+ request.json = json_decode(body)

+ except ValueError:

+ raise cherrypy.HTTPError(400, 'Invalid JSON document')

+ if force:

+ request.body.processors.clear()

+ request.body.default_proc = cherrypy.HTTPError(

+ 415, 'Expected an application/json content type')

+ request.body.processors['application/json'] = json_processor

+We begin by defining a new ``json_processor`` function to stick in the ``processors``

+dictionary. All processor functions take a single argument, the ``Entity`` instance

+they are to process. It will be called whenever a request is received (for those

+URI's where the tool is turned on) which has a ``Content-Type`` of

+"application/json".

+First, it checks for a valid ``Content-Length`` (raising 411 if not valid), then

+reads the remaining bytes on the socket. The ``fp`` object knows its own length, so

+it won't hang waiting for data that never arrives. It will return when all data

+has been read. Then, we decode those bytes using Python's built-in ``json`` module,

+and stick the decoded result onto ``request.json`` . If it cannot be decoded, we

+raise 400.

+If the "force" argument is True (the default), the ``Tool`` clears the ``processors``

+dict so that request entities of other ``Content-Types`` aren't parsed at all. Since

+there's no entry for those invalid MIME types, the ``default_proc`` method of ``cherrypy.request.body``

+is called. But this does nothing by default (usually to provide the page handler an opportunity to handle it.)

+But in our case, we want to raise 415, so we replace ``request.body.default_proc``

+with the error (``HTTPError`` instances, when called, raise themselves).

+If we were defining a custom processor, we can do so without making a ``Tool``. Just add the config entry::

+ request.body.processors = {'application/json': json_processor}

+Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.

+"""

+try:

+ from io import DEFAULT_BUFFER_SIZE

+except ImportError:

+ DEFAULT_BUFFER_SIZE = 8192

+import re

+import sys

+import tempfile

+try:

+ from urllib import unquote_plus

+except ImportError:

+ def unquote_plus(bs):

+ """Bytes version of urllib.parse.unquote_plus."""

+ bs = bs.replace(ntob('+'), ntob(' '))

+ atoms = bs.split(ntob('%'))

+ for i in range(1, len(atoms)):

+ item = atoms[i]

+ try:

+ pct = int(item[:2], 16)

+ atoms[i] = bytes([pct]) + item[2:]

+ except ValueError:

+ pass

+ return ntob('').join(atoms)

+import cherrypy

+from cherrypy._cpcompat import basestring, ntob, ntou

+from cherrypy.lib import httputil

+# -------------------------------- Processors -------------------------------- #

+def process_urlencoded(entity):

+ """Read application/x-www-form-urlencoded data into entity.params."""

+ qs = entity.fp.read()

+ for charset in entity.attempt_charsets:

+ try:

+ params = {}

+ for aparam in qs.split(ntob('&')):

+ for pair in aparam.split(ntob(';')):

+ if not pair:

+ continue

+ atoms = pair.split(ntob('='), 1)

+ if len(atoms) == 1:

+ atoms.append(ntob(''))

+ key = unquote_plus(atoms[0]).decode(charset)

+ value = unquote_plus(atoms[1]).decode(charset)

+ if key in params:

+ if not isinstance(params[key], list):

+ params[key] = [params[key]]

+ params[key].append(value)

+ else:

+ params[key] = value

+ except UnicodeDecodeError:

+ pass

+ else:

+ entity.charset = charset

+ break

+ else:

+ raise cherrypy.HTTPError(

+ 400, "The request entity could not be decoded. The following "

+ "charsets were attempted: %s" % repr(entity.attempt_charsets))

+ # Now that all values have been successfully parsed and decoded,

+ # apply them to the entity.params dict.

+ for key, value in params.items():

+ if key in entity.params:

+ if not isinstance(entity.params[key], list):

+ entity.params[key] = [entity.params[key]]

+ entity.params[key].append(value)

+ else:

+ entity.params[key] = value

+def process_multipart(entity):

+ """Read all multipart parts into entity.parts."""

+ ib = ""

+ if 'boundary' in entity.content_type.params:

+ # http://tools.ietf.org/html/rfc2046#section-5.1.1

+ # "The grammar for parameters on the Content-type field is such that it

+ # is often necessary to enclose the boundary parameter values in quotes

+ # on the Content-type line"

+ ib = entity.content_type.params['boundary'].strip('"')

+ if not re.match("^[ -~]{0,200}[!-~]$", ib):

+ raise ValueError('Invalid boundary in multipart form: %r' % (ib,))

+ ib = ('--' + ib).encode('ascii')

+ # Find the first marker

+ while True:

+ b = entity.readline()

+ if not b:

+ return

+ b = b.strip()

+ if b == ib:

+ break

+ # Read all parts

+ while True:

+ part = entity.part_class.from_fp(entity.fp, ib)

+ entity.parts.append(part)

+ part.process()

+ if part.fp.done:

+ break

+def process_multipart_form_data(entity):

+ """Read all multipart/form-data parts into entity.parts or entity.params."""

+ process_multipart(entity)

+ kept_parts = []

+ for part in entity.parts:

+ if part.name is None:

+ kept_parts.append(part)

+ else:

+ if part.filename is None:

+ # It's a regular field

+ value = part.fullvalue()

+ else:

+ # It's a file upload. Retain the whole part so consumer code

+ # has access to its .file and .filename attributes.

+ value = part

+ if part.name in entity.params:

+ if not isinstance(entity.params[part.name], list):

+ entity.params[part.name] = [entity.params[part.name]]

+ entity.params[part.name].append(value)

+ else:

+ entity.params[part.name] = value

+ entity.parts = kept_parts

+def _old_process_multipart(entity):

+ """The behavior of 3.2 and lower. Deprecated and will be changed in 3.3."""

+ process_multipart(entity)

+ params = entity.params

+ for part in entity.parts:

+ if part.name is None:

+ key = ntou('parts')

+ else:

+ key = part.name

+ if part.filename is None:

+ # It's a regular field

+ value = part.fullvalue()

+ else:

+ # It's a file upload. Retain the whole part so consumer code

+ # has access to its .file and .filename attributes.

+ value = part

+ if key in params:

+ if not isinstance(params[key], list):

+ params[key] = [params[key]]

+ params[key].append(value)

+ else:

+ params[key] = value

+# --------------------------------- Entities --------------------------------- #

+class Entity(object):

+ """An HTTP request body, or MIME multipart body.

+ This class collects information about the HTTP request entity. When a

+ given entity is of MIME type "multipart", each part is parsed into its own

+ Entity instance, and the set of parts stored in

+ :attr:`entity.parts<cherrypy._cpreqbody.Entity.parts>`.

+ Between the ``before_request_body`` and ``before_handler`` tools, CherryPy

+ tries to process the request body (if any) by calling

+ :func:`request.body.process<cherrypy._cpreqbody.RequestBody.process`.

+ This uses the ``content_type`` of the Entity to look up a suitable processor

+ in :attr:`Entity.processors<cherrypy._cpreqbody.Entity.processors>`, a dict.

+ If a matching processor cannot be found for the complete Content-Type,

+ it tries again using the major type. For example, if a request with an

+ entity of type "image/jpeg" arrives, but no processor can be found for

+ that complete type, then one is sought for the major type "image". If a

+ processor is still not found, then the

+ :func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>` method of the

+ Entity is called (which does nothing by default; you can override this too).

+ CherryPy includes processors for the "application/x-www-form-urlencoded"

+ type, the "multipart/form-data" type, and the "multipart" major type.

+ CherryPy 3.2 processes these types almost exactly as older versions.

+ Parts are passed as arguments to the page handler using their

+ ``Content-Disposition.name`` if given, otherwise in a generic "parts"

+ argument. Each such part is either a string, or the

+ :class:`Part<cherrypy._cpreqbody.Part>` itself if it's a file. (In this

+ case it will have ``file`` and ``filename`` attributes, or possibly a

+ ``value`` attribute). Each Part is itself a subclass of

+ Entity, and has its own ``process`` method and ``processors`` dict.

+ There is a separate processor for the "multipart" major type which is more

+ flexible, and simply stores all multipart parts in

+ :attr:`request.body.parts<cherrypy._cpreqbody.Entity.parts>`. You can

+ enable it with::

+ cherrypy.request.body.processors['multipart'] = _cpreqbody.process_multipart

+ in an ``on_start_resource`` tool.

+ """

+ # http://tools.ietf.org/html/rfc2046#section-4.1.2:

+ # "The default character set, which must be assumed in the

+ # absence of a charset parameter, is US-ASCII."

+ # However, many browsers send data in utf-8 with no charset.

+ attempt_charsets = ['utf-8']

+ """A list of strings, each of which should be a known encoding.

+ When the Content-Type of the request body warrants it, each of the given

+ encodings will be tried in order. The first one to successfully decode the

+ entity without raising an error is stored as

+ :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults

+ to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by

+ `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),

+ but ``['us-ascii', 'utf-8']`` for multipart parts.

+ """

+ charset = None

+ """The successful decoding; see "attempt_charsets" above."""

+ content_type = None

+ """The value of the Content-Type request header.

+ If the Entity is part of a multipart payload, this will be the Content-Type

+ given in the MIME headers for this part.

+ """

+ default_content_type = 'application/x-www-form-urlencoded'

+ """This defines a default ``Content-Type`` to use if no Content-Type header

+ is given. The empty string is used for RequestBody, which results in the

+ request body not being read or parsed at all. This is by design; a missing

+ ``Content-Type`` header in the HTTP request entity is an error at best,

+ and a security hole at worst. For multipart parts, however, the MIME spec

+ declares that a part with no Content-Type defaults to "text/plain"

+ (see :class:`Part<cherrypy._cpreqbody.Part>`).

+ """

+ filename = None

+ """The ``Content-Disposition.filename`` header, if available."""

+ fp = None

+ """The readable socket file object."""

+ headers = None

+ """A dict of request/multipart header names and values.

+ This is a copy of the ``request.headers`` for the ``request.body``;

+ for multipart parts, it is the set of headers for that part.

+ """

+ length = None

+ """The value of the ``Content-Length`` header, if provided."""

+ name = None

+ """The "name" parameter of the ``Content-Disposition`` header, if any."""

+ params = None

+ """

+ If the request Content-Type is 'application/x-www-form-urlencoded' or

+ multipart, this will be a dict of the params pulled from the entity

+ body; that is, it will be the portion of request.params that come

+ from the message body (sometimes called "POST params", although they

+ can be sent with various HTTP method verbs). This value is set between

+ the 'before_request_body' and 'before_handler' hooks (assuming that

+ process_request_body is True)."""

+ processors = {'application/x-www-form-urlencoded': process_urlencoded,

+ 'multipart/form-data': process_multipart_form_data,

+ 'multipart': process_multipart,

+ }

+ """A dict of Content-Type names to processor methods."""

+ parts = None

+ """A list of Part instances if ``Content-Type`` is of major type "multipart"."""

+ part_class = None

+ """The class used for multipart parts.

+ You can replace this with custom subclasses to alter the processing of

+ multipart parts.

+ """

+ def __init__(self, fp, headers, params=None, parts=None):

+ # Make an instance-specific copy of the class processors

+ # so Tools, etc. can replace them per-request.

+ self.processors = self.processors.copy()

+ self.fp = fp

+ self.headers = headers

+ if params is None:

+ params = {}

+ self.params = params

+ if parts is None:

+ parts = []

+ self.parts = parts

+ # Content-Type

+ self.content_type = headers.elements('Content-Type')

+ if self.content_type:

+ self.content_type = self.content_type[0]

+ else:

+ self.content_type = httputil.HeaderElement.from_str(

+ self.default_content_type)

+ # Copy the class 'attempt_charsets', prepending any Content-Type charset

+ dec = self.content_type.params.get("charset", None)

+ if dec:

+ self.attempt_charsets = [dec] + [c for c in self.attempt_charsets

+ if c != dec]

+ else:

+ self.attempt_charsets = self.attempt_charsets[:]

+ # Length

+ self.length = None

+ clen = headers.get('Content-Length', None)

+ # If Transfer-Encoding is 'chunked', ignore any Content-Length.

+ if clen is not None and 'chunked' not in headers.get('Transfer-Encoding', ''):

+ try:

+ self.length = int(clen)

+ except ValueError:

+ pass

+ # Content-Disposition

+ self.name = None

+ self.filename = None

+ disp = headers.elements('Content-Disposition')

+ if disp:

+ disp = disp[0]

+ if 'name' in disp.params:

+ self.name = disp.params['name']

+ if self.name.startswith('"') and self.name.endswith('"'):

+ self.name = self.name[1:-1]

+ if 'filename' in disp.params:

+ self.filename = disp.params['filename']

+ if self.filename.startswith('"') and self.filename.endswith('"'):

+ self.filename = self.filename[1:-1]

+ # The 'type' attribute is deprecated in 3.2; remove it in 3.3.

+ type = property(lambda self: self.content_type,

+ doc="""A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.""")

+ def read(self, size=None, fp_out=None):

+ return self.fp.read(size, fp_out)

+ def readline(self, size=None):

+ return self.fp.readline(size)

+ def readlines(self, sizehint=None):

+ return self.fp.readlines(sizehint)

+ def __iter__(self):

+ return self

+ def __next__(self):

+ line = self.readline()

+ if not line:

+ raise StopIteration

+ return line

+ def next(self):

+ return self.__next__()

+ def read_into_file(self, fp_out=None):

+ """Read the request body into fp_out (or make_file() if None). Return fp_out."""

+ if fp_out is None:

+ fp_out = self.make_file()

+ self.read(fp_out=fp_out)

+ return fp_out

+ def make_file(self):

+ """Return a file-like object into which the request body will be read.

+ By default, this will return a TemporaryFile. Override as needed.

+ See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`."""

+ return tempfile.TemporaryFile()

+ def fullvalue(self):

+ """Return this entity as a string, whether stored in a file or not."""

+ if self.file:

+ # It was stored in a tempfile. Read it.

+ self.file.seek(0)

+ value = self.file.read()

+ self.file.seek(0)

+ else:

+ value = self.value

+ return value

+ def process(self):

+ """Execute the best-match processor for the given media type."""

+ proc = None

+ ct = self.content_type.value

+ try:

+ proc = self.processors[ct]

+ except KeyError:

+ toptype = ct.split('/', 1)[0]

+ try:

+ proc = self.processors[toptype]

+ except KeyError:

+ pass

+ if proc is None:

+ self.default_proc()

+ else:

+ proc(self)

+ def default_proc(self):

+ """Called if a more-specific processor is not found for the ``Content-Type``."""

+ # Leave the fp alone for someone else to read. This works fine

+ # for request.body, but the Part subclasses need to override this

+ # so they can move on to the next part.

+ pass

+class Part(Entity):

+ """A MIME part entity, part of a multipart entity."""

+ # "The default character set, which must be assumed in the absence of a

+ # charset parameter, is US-ASCII."

+ attempt_charsets = ['us-ascii', 'utf-8']

+ """A list of strings, each of which should be a known encoding.

+ When the Content-Type of the request body warrants it, each of the given

+ encodings will be tried in order. The first one to successfully decode the

+ entity without raising an error is stored as

+ :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults

+ to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by

+ `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),

+ but ``['us-ascii', 'utf-8']`` for multipart parts.

+ """

+ boundary = None

+ """The MIME multipart boundary."""

+ default_content_type = 'text/plain'

+ """This defines a default ``Content-Type`` to use if no Content-Type header

+ is given. The empty string is used for RequestBody, which results in the

+ request body not being read or parsed at all. This is by design; a missing

+ ``Content-Type`` header in the HTTP request entity is an error at best,

+ and a security hole at worst. For multipart parts, however (this class),

+ the MIME spec declares that a part with no Content-Type defaults to

+ "text/plain".

+ """

+ # This is the default in stdlib cgi. We may want to increase it.

+ maxrambytes = 1000

+ """The threshold of bytes after which point the ``Part`` will store its data

+ in a file (generated by :func:`make_file<cherrypy._cprequest.Entity.make_file>`)

+ instead of a string. Defaults to 1000, just like the :mod:`cgi` module in

+ Python's standard library.

+ """

+ def __init__(self, fp, headers, boundary):

+ Entity.__init__(self, fp, headers)

+ self.boundary = boundary

+ self.file = None

+ self.value = None

+ def from_fp(cls, fp, boundary):

+ headers = cls.read_headers(fp)

+ return cls(fp, headers, boundary)

+ from_fp = classmethod(from_fp)

+ def read_headers(cls, fp):

+ headers = httputil.HeaderMap()

+ while True:

+ line = fp.readline()

+ if not line:

+ # No more data--illegal end of headers

+ raise EOFError("Illegal end of headers.")

+ if line == ntob('\r\n'):

+ # Normal end of headers

+ break

+ if not line.endswith(ntob('\r\n')):

+ raise ValueError("MIME requires CRLF terminators: %r" % line)

+ if line[0] in ntob(' \t'):

+ # It's a continuation line.

+ v = line.strip().decode('ISO-8859-1')

+ else:

+ k, v = line.split(ntob(":"), 1)

+ k = k.strip().decode('ISO-8859-1')

+ v = v.strip().decode('ISO-8859-1')

+ existing = headers.get(k)

+ if existing:

+ v = ", ".join((existing, v))

+ headers[k] = v

+ return headers

+ read_headers = classmethod(read_headers)

+ def read_lines_to_boundary(self, fp_out=None):

+ """Read bytes from self.fp and return or write them to a file.

+ If the 'fp_out' argument is None (the default), all bytes read are

+ returned in a single byte string.

+ If the 'fp_out' argument is not None, it must be a file-like object that

+ supports the 'write' method; all bytes read will be written to the fp,

+ and that fp is returned.

+ """

+ endmarker = self.boundary + ntob("--")

+ delim = ntob("")

+ prev_lf = True

+ lines = []

+ seen = 0

+ while True:

+ line = self.fp.readline(1<<16)

+ if not line:

+ raise EOFError("Illegal end of multipart body.")

+ if line.startswith(ntob("--")) and prev_lf:

+ strippedline = line.strip()

+ if strippedline == self.boundary:

+ break

+ if strippedline == endmarker:

+ self.fp.finish()

+ break

+ line = delim + line

+ if line.endswith(ntob("\r\n")):

+ delim = ntob("\r\n")

+ line = line[:-2]

+ prev_lf = True

+ elif line.endswith(ntob("\n")):

+ delim = ntob("\n")

+ line = line[:-1]

+ prev_lf = True

+ else:

+ delim = ntob("")

+ prev_lf = False

+ if fp_out is None:

+ lines.append(line)

+ seen += len(line)

+ if seen > self.maxrambytes:

+ fp_out = self.make_file()

+ for line in lines:

+ fp_out.write(line)

+ else:

+ fp_out.write(line)

+ if fp_out is None:

+ result = ntob('').join(lines)

+ for charset in self.attempt_charsets:

+ try:

+ result = result.decode(charset)

+ except UnicodeDecodeError:

+ pass

+ else:

+ self.charset = charset

+ return result

+ else:

+ raise cherrypy.HTTPError(

+ 400, "The request entity could not be decoded. The following "

+ "charsets were attempted: %s" % repr(self.attempt_charsets))

+ else:

+ fp_out.seek(0)

+ return fp_out

+ def default_proc(self):

+ """Called if a more-specific processor is not found for the ``Content-Type``."""

+ if self.filename:

+ # Always read into a file if a .filename was given.

+ self.file = self.read_into_file()

+ else:

+ result = self.read_lines_to_boundary()

+ if isinstance(result, basestring):

+ self.value = result

+ else:

+ self.file = result

+ def read_into_file(self, fp_out=None):

+ """Read the request body into fp_out (or make_file() if None). Return fp_out."""

+ if fp_out is None:

+ fp_out = self.make_file()

+ self.read_lines_to_boundary(fp_out=fp_out)

+ return fp_out

+Entity.part_class = Part

+try:

+ inf = float('inf')

+except ValueError:

+ # Python 2.4 and lower

+ class Infinity(object):

+ def __cmp__(self, other):

+ return 1

+ def __sub__(self, other):

+ return self

+ inf = Infinity()

+comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept-Encoding',

+ 'Accept-Language', 'Accept-Ranges', 'Allow', 'Cache-Control', 'Connection',

+ 'Content-Encoding', 'Content-Language', 'Expect', 'If-Match',

+ 'If-None-Match', 'Pragma', 'Proxy-Authenticate', 'Te', 'Trailer',

+ 'Transfer-Encoding', 'Upgrade', 'Vary', 'Via', 'Warning', 'Www-Authenticate']

+class SizedReader:

+ def __init__(self, fp, length, maxbytes, bufsize=DEFAULT_BUFFER_SIZE, has_trailers=False):

+ # Wrap our fp in a buffer so peek() works

+ self.fp = fp

+ self.length = length

+ self.maxbytes = maxbytes

+ self.buffer = ntob('')

+ self.bufsize = bufsize

+ self.bytes_read = 0

+ self.done = False

+ self.has_trailers = has_trailers

+ def read(self, size=None, fp_out=None):

+ """Read bytes from the request body and return or write them to a file.

+ A number of bytes less than or equal to the 'size' argument are read

+ off the socket. The actual number of bytes read are tracked in

+ self.bytes_read. The number may be smaller than 'size' when 1) the

+ client sends fewer bytes, 2) the 'Content-Length' request header

+ specifies fewer bytes than requested, or 3) the number of bytes read

+ exceeds self.maxbytes (in which case, 413 is raised).

+ If the 'fp_out' argument is None (the default), all bytes read are

+ returned in a single byte string.

+ If the 'fp_out' argument is not None, it must be a file-like object that

+ supports the 'write' method; all bytes read will be written to the fp,

+ and None is returned.

+ """

+ if self.length is None:

+ if size is None:

+ remaining = inf

+ else:

+ remaining = size

+ else:

+ remaining = self.length - self.bytes_read

+ if size and size < remaining:

+ remaining = size

+ if remaining == 0:

+ self.finish()

+ if fp_out is None:

+ return ntob('')

+ else:

+ return None

+ chunks = []

+ # Read bytes from the buffer.

+ if self.buffer:

+ if remaining is inf:

+ data = self.buffer

+ self.buffer = ntob('')

+ else:

+ data = self.buffer[:remaining]

+ self.buffer = self.buffer[remaining:]

+ datalen = len(data)

+ remaining -= datalen

+ # Check lengths.

+ self.bytes_read += datalen

+ if self.maxbytes and self.bytes_read > self.maxbytes:

+ raise cherrypy.HTTPError(413)

+ # Store the data.

+ if fp_out is None:

+ chunks.append(data)

+ else:

+ fp_out.write(data)

+ # Read bytes from the socket.

+ while remaining > 0:

+ chunksize = min(remaining, self.bufsize)

+ try:

+ data = self.fp.read(chunksize)

+ except Exception:

+ e = sys.exc_info()[1]

+ if e.__class__.__name__ == 'MaxSizeExceeded':

+ # Post data is too big

+ raise cherrypy.HTTPError(

+ 413, "Maximum request length: %r" % e.args[1])

+ else:

+ raise

+ if not data:

+ self.finish()

+ break

+ datalen = len(data)

+ remaining -= datalen

+ # Check lengths.

+ self.bytes_read += datalen

+ if self.maxbytes and self.bytes_read > self.maxbytes:

+ raise cherrypy.HTTPError(413)

+ # Store the data.

+ if fp_out is None:

+ chunks.append(data)

+ else:

+ fp_out.write(data)

+ if fp_out is None:

+ return ntob('').join(chunks)

+ def readline(self, size=None):

+ """Read a line from the request body and return it."""

+ chunks = []

+ while size is None or size > 0:

+ chunksize = self.bufsize

+ if size is not None and size < self.bufsize:

+ chunksize = size

+ data = self.read(chunksize)

+ if not data:

+ break

+ pos = data.find(ntob('\n')) + 1

+ if pos:

+ chunks.append(data[:pos])

+ remainder = data[pos:]

+ self.buffer += remainder

+ self.bytes_read -= len(remainder)

+ break

+ else:

+ chunks.append(data)

+ return ntob('').join(chunks)

+ def readlines(self, sizehint=None):

+ """Read lines from the request body and return them."""

+ if self.length is not None:

+ if sizehint is None:

+ sizehint = self.length - self.bytes_read

+ else:

+ sizehint = min(sizehint, self.length - self.bytes_read)

+ lines = []

+ seen = 0

+ while True:

+ line = self.readline()

+ if not line:

+ break

+ lines.append(line)

+ seen += len(line)

+ if seen >= sizehint:

+ break

+ return lines

+ def finish(self):

+ self.done = True

+ if self.has_trailers and hasattr(self.fp, 'read_trailer_lines'):

+ self.trailers = {}

+ try:

+ for line in self.fp.read_trailer_lines():

+ if line[0] in ntob(' \t'):

+ # It's a continuation line.

+ v = line.strip()

+ else:

+ try:

+ k, v = line.split(ntob(":"), 1)

+ except ValueError:

+ raise ValueError("Illegal header line.")

+ k = k.strip().title()

+ v = v.strip()

+ if k in comma_separated_headers:

+ existing = self.trailers.get(envname)

+ if existing:

+ v = ntob(", ").join((existing, v))

+ self.trailers[k] = v

+ except Exception:

+ e = sys.exc_info()[1]

+ if e.__class__.__name__ == 'MaxSizeExceeded':

+ # Post data is too big

+ raise cherrypy.HTTPError(

+ 413, "Maximum request length: %r" % e.args[1])

+ else:

+ raise

+class RequestBody(Entity):

+ """The entity of the HTTP request."""

+ bufsize = 8 * 1024

+ """The buffer size used when reading the socket."""

+ # Don't parse the request body at all if the client didn't provide

+ # a Content-Type header. See http://www.cherrypy.org/ticket/790

+ default_content_type = ''

+ """This defines a default ``Content-Type`` to use if no Content-Type header

+ is given. The empty string is used for RequestBody, which results in the

+ request body not being read or parsed at all. This is by design; a missing

+ ``Content-Type`` header in the HTTP request entity is an error at best,

+ and a security hole at worst. For multipart parts, however, the MIME spec

+ declares that a part with no Content-Type defaults to "text/plain"

+ (see :class:`Part<cherrypy._cpreqbody.Part>`).

+ """

+ maxbytes = None

+ """Raise ``MaxSizeExceeded`` if more bytes than this are read from the socket."""

+ def __init__(self, fp, headers, params=None, request_params=None):

+ Entity.__init__(self, fp, headers, params)

+ # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1

+ # When no explicit charset parameter is provided by the

+ # sender, media subtypes of the "text" type are defined

+ # to have a default charset value of "ISO-8859-1" when

+ # received via HTTP.

+ if self.content_type.value.startswith('text/'):

+ for c in ('ISO-8859-1', 'iso-8859-1', 'Latin-1', 'latin-1'):

+ if c in self.attempt_charsets:

+ break

+ else:

+ self.attempt_charsets.append('ISO-8859-1')

+ # Temporary fix while deprecating passing .parts as .params.

+ self.processors['multipart'] = _old_process_multipart

+ if request_params is None:

+ request_params = {}

+ self.request_params = request_params

+ def process(self):

+ """Process the request entity based on its Content-Type."""

+ # "The presence of a message-body in a request is signaled by the

+ # inclusion of a Content-Length or Transfer-Encoding header field in

+ # the request's message-headers."

+ # It is possible to send a POST request with no body, for example;

+ # however, app developers are responsible in that case to set

+ # cherrypy.request.process_body to False so this method isn't called.

+ h = cherrypy.serving.request.headers

+ if 'Content-Length' not in h and 'Transfer-Encoding' not in h:

+ raise cherrypy.HTTPError(411)

+ self.fp = SizedReader(self.fp, self.length,

+ self.maxbytes, bufsize=self.bufsize,

+ has_trailers='Trailer' in h)

+ super(RequestBody, self).process()

+ # Body params should also be a part of the request_params

+ # add them in here.

+ request_params = self.request_params

+ for key, value in self.params.items():

+ # Python 2 only: keyword arguments must be byte strings (type 'str').

+ if sys.version_info < (3, 0):

+ if isinstance(key, unicode):

+ key = key.encode('ISO-8859-1')

+ if key in request_params:

+ if not isinstance(request_params[key], list):

+ request_params[key] = [request_params[key]]

+ request_params[key].append(value)

+ else:

+ request_params[key] = value

Property changes on: third_party/cherrypy/_cpreqbody.py

___________________________________________________________________

Added: svn:eol-style

+ LF

« no previous file with comments | « third_party/cherrypy/_cpnative_server.py ('k') | third_party/cherrypy/_cprequest.py » ('j') | no next file with comments »