third_party/cherrypy/_cpreqbody.py - Issue 9368042: Add CherryPy to third_party.

Side by Side Diff: third_party/cherrypy/_cpreqbody.py

Issue 9368042: Add CherryPy to third_party. (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/tools/build/

Patch Set: '' Created 8 years, 10 months ago

Property Changes:

Added: svn:eol-style
+ LF

OLD	NEW
(Empty)
	1 """Request body processing for CherryPy.

	2

	3 .. versionadded:: 3.2

	4

	5 Application authors have complete control over the parsing of HTTP request

	6 entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`

	7 is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,

	8 and that class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.

	9

	10 When an HTTP request includes an entity body, it is often desirable to

	11 provide that information to applications in a form other than the raw bytes.

	12 Different content types demand different approaches. Examples:

	13

	14 * For a GIF file, we want the raw bytes in a stream.

	15 * An HTML form is better parsed into its component fields, and each text field

	16 decoded from bytes to unicode.

	17 * A JSON body should be deserialized into a Python dict or list.

	18

	19 When the request contains a Content-Type header, the media type is used as a

	20 key to look up a value in the

	21 :attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.

	22 If the full media

	23 type is not found, then the major type is tried; for example, if no processor

	24 is found for the 'image/jpeg' type, then we look for a processor for the 'image'

	25 types altogether. If neither the full type nor the major type has a matching

	26 processor, then a default processor is used

	27 (:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most

	28 types, this means no processing is done, and the body is left unread as a

	29 raw byte stream. Processors are configurable in an 'on_start_resource' hook.

	30

	31 Some processors, especially those for the 'text' types, attempt to decode bytes

	32 to unicode. If the Content-Type request header includes a 'charset' parameter,

	33 this is used to decode the entity. Otherwise, one or more default charsets may

	34 be attempted, although this decision is up to each processor. If a processor

	35 successfully decodes an Entity or Part, it should set the

	36 :attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute

	37 on the Entity or Part to the name of the successful charset, so that

	38 applications can easily re-encode or transcode the value if they wish.

	39

	40 If the Content-Type of the request entity is of major type 'multipart', then

	41 the above parsing process, and possibly a decoding process, is performed for

	42 each part.

	43

	44 For both the full entity and multipart parts, a Content-Disposition header may

	45 be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and

	46 :attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the

	47 request.body or the Part.

	48

	49 .. _custombodyprocessors:

	50

	51 Custom Processors

	52 =================

	53

	54 You can add your own processors for any specific or major MIME type. Simply add

	55 it to the :attr:`processors<cherrypy._cpreqbody.Entity.processors>` dict in a

	56 hook/tool that runs at ``on_start_resource`` or ``before_request_body``.

	57 Here's the built-in JSON tool for an example::

	58

	59     def json_in(force=True, debug=False):

	60         request = cherrypy.serving.request

	61         def json_processor(entity):

	62             \"""Read application/json data into request.json.\"""

	63             if not entity.headers.get("Content-Length", ""):

	64                 raise cherrypy.HTTPError(411)

	65

	66             body = entity.fp.read()

	67             try:

	68                 request.json = json_decode(body)

	69             except ValueError:

	70                 raise cherrypy.HTTPError(400, 'Invalid JSON document')

	71         if force:

	72             request.body.processors.clear()

	73             request.body.default_proc = cherrypy.HTTPError(

	74                 415, 'Expected an application/json content type')

	75         request.body.processors['application/json'] = json_processor

	76

	77 We begin by defining a new ``json_processor`` function to stick in the ``processors``

	78 dictionary. All processor functions take a single argument, the ``Entity`` instance

	79 they are to process. It will be called whenever a request is received (for those

	80 URIs where the tool is turned on) which has a ``Content-Type`` of

	81 "application/json".

	82

	83 First, it checks for a valid ``Content-Length`` (raising 411 if not valid), then

	84 reads the remaining bytes on the socket. The ``fp`` object knows its own length, so

	85 it won't hang waiting for data that never arrives. It will return when all data

	86 has been read. Then, we decode those bytes using Python's built-in ``json`` module,

	87 and stick the decoded result onto ``request.json``. If it cannot be decoded, we

	88 raise 400.

	89

	90 If the "force" argument is True (the default), the ``Tool`` clears the ``processors``

	91 dict so that request entities of other ``Content-Types`` aren't parsed at all. Since

	92 there's no entry for those invalid MIME types, the ``default_proc`` method of ``cherrypy.request.body``

	93 is called. By default it does nothing, which leaves the body unread so the page handler can deal with it.

	94 In our case, however, we want to raise 415, so we replace ``request.body.default_proc``

	95 with the error (``HTTPError`` instances, when called, raise themselves).

	96

	97 If you only need to register a custom processor, you can do so without writing a ``Tool``. Just add the config entry::

	98

	99     request.body.processors = {'application/json': json_processor}

	100

	101 Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.

	102 """
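The lookup order described in the docstring above (full media type, then major type, then a default) can be sketched without CherryPy. This is only an illustration: `find_processor` and the string stand-ins for processor functions are hypothetical names, not part of the module.

```python
def find_processor(processors, content_type, default=None):
    """Return the processor for content_type, falling back to its major type."""
    try:
        # First try the complete media type, e.g. 'multipart/form-data'.
        return processors[content_type]
    except KeyError:
        # Then try only the major type, e.g. 'multipart'.
        major = content_type.split("/", 1)[0]
        return processors.get(major, default)


# Strings stand in for the real processor callables in this sketch.
processors = {
    "application/json": "json_processor",
    "multipart/form-data": "form_data_processor",
    "multipart": "generic_multipart_processor",
}

exact = find_processor(processors, "multipart/form-data")
by_major = find_processor(processors, "multipart/mixed")
fallback = find_processor(processors, "image/jpeg")
```

Here `exact` resolves through the full type, `by_major` through the 'multipart' entry, and `fallback` is the default because neither 'image/jpeg' nor 'image' is registered.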

	103

	104 try:

	105 from io import DEFAULT_BUFFER_SIZE

	106 except ImportError:

	107 DEFAULT_BUFFER_SIZE = 8192

	108 import re

	109 import sys

	110 import tempfile

	111 try:

	112 from urllib import unquote_plus

	113 except ImportError:

	114 def unquote_plus(bs):

	115 """Bytes version of urllib.parse.unquote_plus."""

	116 bs = bs.replace(ntob('+'), ntob(' '))

	117 atoms = bs.split(ntob('%'))

	118 for i in range(1, len(atoms)):

	119 item = atoms[i]

	120 try:

	121 pct = int(item[:2], 16)

	122 atoms[i] = bytes([pct]) + item[2:]

	123 except ValueError:

	124 pass

	125 return ntob('').join(atoms)

	126

	127 import cherrypy

	128 from cherrypy._cpcompat import basestring, ntob, ntou

	129 from cherrypy.lib import httputil

	130

	131

	132 # -------------------------------- Processors -------------------------------- #

	133

	134 def process_urlencoded(entity):

	135 """Read application/x-www-form-urlencoded data into entity.params."""

	136 qs = entity.fp.read()

	137 for charset in entity.attempt_charsets:

	138 try:

	139 params = {}

	140 for aparam in qs.split(ntob('&')):

	141 for pair in aparam.split(ntob(';')):

	142 if not pair:

	143 continue

	144

	145 atoms = pair.split(ntob('='), 1)

	146 if len(atoms) == 1:

	147 atoms.append(ntob(''))

	148

	149 key = unquote_plus(atoms[0]).decode(charset)

	150 value = unquote_plus(atoms[1]).decode(charset)

	151

	152 if key in params:

	153 if not isinstance(params[key], list):

	154 params[key] = [params[key]]

	155 params[key].append(value)

	156 else:

	157 params[key] = value

	158 except UnicodeDecodeError:

	159 pass

	160 else:

	161 entity.charset = charset

	162 break

	163 else:

	164 raise cherrypy.HTTPError(

	165 400, "The request entity could not be decoded. The following "

	166 "charsets were attempted: %s" % repr(entity.attempt_charsets))

	167

	168 # Now that all values have been successfully parsed and decoded,

	169 # apply them to the entity.params dict.

	170 for key, value in params.items():

	171 if key in entity.params:

	172 if not isinstance(entity.params[key], list):

	173 entity.params[key] = [entity.params[key]]

	174 entity.params[key].append(value)

	175 else:

	176 entity.params[key] = value
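The charset-fallback loop in `process_urlencoded` can be tried in isolation. The sketch below assumes Python 3 and uses the standard library's `urllib.parse.unquote_to_bytes` in place of the module's own `unquote_plus` fallback; `parse_urlencoded` is a hypothetical helper, and it raises `ValueError` where the real code raises `cherrypy.HTTPError(400)`.

```python
from urllib.parse import unquote_to_bytes


def parse_urlencoded(qs, attempt_charsets=("utf-8",)):
    """Parse a urlencoded byte string; return (params, charset_that_worked)."""
    for charset in attempt_charsets:
        try:
            params = {}
            for aparam in qs.split(b"&"):
                for pair in aparam.split(b";"):
                    if not pair:
                        continue
                    key, _, value = pair.partition(b"=")
                    # '+' means space; percent-escapes are decoded to raw bytes
                    # before the charset is applied.
                    key = unquote_to_bytes(key.replace(b"+", b" ")).decode(charset)
                    value = unquote_to_bytes(value.replace(b"+", b" ")).decode(charset)
                    if key in params:
                        # Repeated keys are collected into a list.
                        if not isinstance(params[key], list):
                            params[key] = [params[key]]
                        params[key].append(value)
                    else:
                        params[key] = value
        except UnicodeDecodeError:
            # Start over with the next candidate charset.
            continue
        return params, charset
    raise ValueError("could not decode with any of %r" % (attempt_charsets,))


repeated, used = parse_urlencoded(b"a=1&b=two+words&a=3")
latin, latin_used = parse_urlencoded(b"name=caf%E9", ("utf-8", "iso-8859-1"))
```

The second call shows why the whole body is re-parsed per charset: the byte 0xE9 is not valid UTF-8, so the first attempt is discarded and the ISO-8859-1 attempt wins.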

	177

	178

	179 def process_multipart(entity):

	180 """Read all multipart parts into entity.parts."""

	181 ib = ""

	182 if 'boundary' in entity.content_type.params:

	183 # http://tools.ietf.org/html/rfc2046#section-5.1.1

	184 # "The grammar for parameters on the Content-type field is such that it

	185 # is often necessary to enclose the boundary parameter values in quotes

	186 # on the Content-type line"

	187 ib = entity.content_type.params['boundary'].strip('"')

	188

	189 if not re.match("^[ -~]{0,200}[!-~]$", ib):

	190 raise ValueError('Invalid boundary in multipart form: %r' % (ib,))

	191

	192 ib = ('--' + ib).encode('ascii')

	193

	194 # Find the first marker

	195 while True:

	196 b = entity.readline()

	197 if not b:

	198 return

	199

	200 b = b.strip()

	201 if b == ib:

	202 break

	203

	204 # Read all parts

	205 while True:

	206 part = entity.part_class.from_fp(entity.fp, ib)

	207 entity.parts.append(part)

	208 part.process()

	209 if part.fp.done:

	210 break
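The boundary check at the top of `process_multipart` can be exercised on its own. The sketch below reuses the regular expression from the code above (printable ASCII, not ending in a space); `boundary_marker` is a hypothetical helper name introduced for illustration.

```python
import re

# Same pattern as in process_multipart: up to 200 printable ASCII characters
# (space through '~') followed by one final character that is not a space.
BOUNDARY_RE = re.compile(r"^[ -~]{0,200}[!-~]$")


def boundary_marker(raw_param):
    """Validate a Content-Type 'boundary' parameter and return the '--' marker."""
    ib = raw_param.strip('"')  # RFC 2046 allows the value to be quoted
    if not BOUNDARY_RE.match(ib):
        raise ValueError("Invalid boundary in multipart form: %r" % (ib,))
    return ("--" + ib).encode("ascii")


quoted = boundary_marker('"AaB03x"')
spaced = boundary_marker("simple boundary")
```

A boundary that is empty or ends in a space is rejected, which matches the `ValueError` raised in the function above.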

	211

	212 def process_multipart_form_data(entity):

	213 """Read all multipart/form-data parts into entity.parts or entity.params."""

	214 process_multipart(entity)

	215

	216 kept_parts = []

	217 for part in entity.parts:

	218 if part.name is None:

	219 kept_parts.append(part)

	220 else:

	221 if part.filename is None:

	222 # It's a regular field

	223 value = part.fullvalue()

	224 else:

	225 # It's a file upload. Retain the whole part so consumer code

	226 # has access to its .file and .filename attributes.

	227 value = part

	228

	229 if part.name in entity.params:

	230 if not isinstance(entity.params[part.name], list):

	231 entity.params[part.name] = [entity.params[part.name]]

	232 entity.params[part.name].append(value)

	233 else:

	234 entity.params[part.name] = value

	235

	236 entity.parts = kept_parts

	237

	238 def _old_process_multipart(entity):

	239 """The behavior of 3.2 and lower. Deprecated and will be changed in 3.3."""

	240 process_multipart(entity)

	241

	242 params = entity.params

	243

	244 for part in entity.parts:

	245 if part.name is None:

	246 key = ntou('parts')

	247 else:

	248 key = part.name

	249

	250 if part.filename is None:

	251 # It's a regular field

	252 value = part.fullvalue()

	253 else:

	254 # It's a file upload. Retain the whole part so consumer code

	255 # has access to its .file and .filename attributes.

	256 value = part

	257

	258 if key in params:

	259 if not isinstance(params[key], list):

	260 params[key] = [params[key]]

	261 params[key].append(value)

	262 else:

	263 params[key] = value

	264

	265

	266

	267 # --------------------------------- Entities --------------------------------- #

	268

	269

	270 class Entity(object):

	271 """An HTTP request body, or MIME multipart body.

	272

	273 This class collects information about the HTTP request entity. When a

	274 given entity is of MIME type "multipart", each part is parsed into its own

	275 Entity instance, and the set of parts stored in

	276 :attr:`entity.parts<cherrypy._cpreqbody.Entity.parts>`.

	277

	278 Between the ``before_request_body`` and ``before_handler`` tools, CherryPy

	279 tries to process the request body (if any) by calling

	280 :func:`request.body.process<cherrypy._cpreqbody.RequestBody.process>`.

	281 This uses the ``content_type`` of the Entity to look up a suitable processor

	282 in :attr:`Entity.processors<cherrypy._cpreqbody.Entity.processors>`, a dict.

	283 If a matching processor cannot be found for the complete Content-Type,

	284 it tries again using the major type. For example, if a request with an

	285 entity of type "image/jpeg" arrives, but no processor can be found for

	286 that complete type, then one is sought for the major type "image". If a

	287 processor is still not found, then the

	288 :func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>` method of the

	289 Entity is called (which does nothing by default; you can override this too).

	290

	291 CherryPy includes processors for the "application/x-www-form-urlencoded"

	292 type, the "multipart/form-data" type, and the "multipart" major type.

	293 CherryPy 3.2 processes these types almost exactly as older versions.

	294 Parts are passed as arguments to the page handler using their

	295 ``Content-Disposition.name`` if given, otherwise in a generic "parts"

	296 argument. Each such part is either a string, or the

	297 :class:`Part<cherrypy._cpreqbody.Part>` itself if it's a file. (In this

	298 case it will have ``file`` and ``filename`` attributes, or possibly a

	299 ``value`` attribute). Each Part is itself a subclass of

	300 Entity, and has its own ``process`` method and ``processors`` dict.

	301

	302 There is a separate processor for the "multipart" major type which is more

	303 flexible, and simply stores all multipart parts in

	304 :attr:`request.body.parts<cherrypy._cpreqbody.Entity.parts>`. You can

	305 enable it with::

	306

	307     cherrypy.request.body.processors['multipart'] = _cpreqbody.process_multipart

	308

	309 in an ``on_start_resource`` tool.

	310 """

	311

	312 # http://tools.ietf.org/html/rfc2046#section-4.1.2:

	313 # "The default character set, which must be assumed in the

	314 # absence of a charset parameter, is US-ASCII."

	315 # However, many browsers send data in utf-8 with no charset.

	316 attempt_charsets = ['utf-8']

	317 """A list of strings, each of which should be a known encoding.

	318

	319 When the Content-Type of the request body warrants it, each of the given

	320 encodings will be tried in order. The first one to successfully decode the

	321 entity without raising an error is stored as

	322 :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults

	323 to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by

	324 `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_ ),

	325 but ``['us-ascii', 'utf-8']`` for multipart parts.

	326 """

	327

	328 charset = None

	329 """The successful decoding; see "attempt_charsets" above."""

	330

	331 content_type = None

	332 """The value of the Content-Type request header.

	333

	334 If the Entity is part of a multipart payload, this will be the Content-Type

	335 given in the MIME headers for this part.

	336 """

	337

	338 default_content_type = 'application/x-www-form-urlencoded'

	339 """This defines a default ``Content-Type`` to use if no Content-Type header

	340 is given. The empty string is used for RequestBody, which results in the

	341 request body not being read or parsed at all. This is by design; a missing

	342 ``Content-Type`` header in the HTTP request entity is an error at best,

	343 and a security hole at worst. For multipart parts, however, the MIME spec

	344 declares that a part with no Content-Type defaults to "text/plain"

	345 (see :class:`Part<cherrypy._cpreqbody.Part>`).

	346 """

	347

	348 filename = None

	349 """The ``Content-Disposition.filename`` header, if available."""

	350

	351 fp = None

	352 """The readable socket file object."""

	353

	354 headers = None

	355 """A dict of request/multipart header names and values.

	356

	357 This is a copy of the ``request.headers`` for the ``request.body``;

	358 for multipart parts, it is the set of headers for that part.

	359 """

	360

	361 length = None

	362 """The value of the ``Content-Length`` header, if provided."""

	363

	364 name = None

	365 """The "name" parameter of the ``Content-Disposition`` header, if any."""

	366

	367 params = None

	368 """

	369 If the request Content-Type is 'application/x-www-form-urlencoded' or

	370 multipart, this will be a dict of the params pulled from the entity

	371 body; that is, it will be the portion of request.params that come

	372 from the message body (sometimes called "POST params", although they

	373 can be sent with various HTTP method verbs). This value is set between

	374 the 'before_request_body' and 'before_handler' hooks (assuming that

	375 process_request_body is True)."""

	376

	377 processors = {'application/x-www-form-urlencoded': process_urlencoded,

	378 'multipart/form-data': process_multipart_form_data,

	379 'multipart': process_multipart,

	380 }

	381 """A dict of Content-Type names to processor methods."""

	382

	383 parts = None

	384 """A list of Part instances if ``Content-Type`` is of major type "multipart"."""

	385

	386 part_class = None

	387 """The class used for multipart parts.

	388

	389 You can replace this with custom subclasses to alter the processing of

	390 multipart parts.

	391 """

	392

	393 def __init__(self, fp, headers, params=None, parts=None):

	394 # Make an instance-specific copy of the class processors

	395 # so Tools, etc. can replace them per-request.

	396 self.processors = self.processors.copy()

	397

	398 self.fp = fp

	399 self.headers = headers

	400

	401 if params is None:

	402 params = {}

	403 self.params = params

	404

	405 if parts is None:

	406 parts = []

	407 self.parts = parts

	408

	409 # Content-Type

	410 self.content_type = headers.elements('Content-Type')

	411 if self.content_type:

	412 self.content_type = self.content_type[0]

	413 else:

	414 self.content_type = httputil.HeaderElement.from_str(

	415 self.default_content_type)

	416

	417 # Copy the class 'attempt_charsets', prepending any Content-Type charset

	418 dec = self.content_type.params.get("charset", None)

	419 if dec:

	420 self.attempt_charsets = [dec] + [c for c in self.attempt_charsets

	421 if c != dec]

	422 else:

	423 self.attempt_charsets = self.attempt_charsets[:]

	424

	425 # Length

	426 self.length = None

	427 clen = headers.get('Content-Length', None)

	428 # If Transfer-Encoding is 'chunked', ignore any Content-Length.

	429 if clen is not None and 'chunked' not in headers.get('Transfer-Encoding', ''):

	430 try:

	431 self.length = int(clen)

	432 except ValueError:

	433 pass

	434

	435 # Content-Disposition

	436 self.name = None

	437 self.filename = None

	438 disp = headers.elements('Content-Disposition')

	439 if disp:

	440 disp = disp[0]

	441 if 'name' in disp.params:

	442 self.name = disp.params['name']

	443 if self.name.startswith('"') and self.name.endswith('"'):

	444 self.name = self.name[1:-1]

	445 if 'filename' in disp.params:

	446 self.filename = disp.params['filename']

	447 if self.filename.startswith('"') and self.filename.endswith('"'):

	448 self.filename = self.filename[1:-1]

	449

	450 # The 'type' attribute is deprecated in 3.2; remove it in 3.3.

	451 type = property(lambda self: self.content_type,

	452 doc="""A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.""")

	453

	454 def read(self, size=None, fp_out=None):

	455 return self.fp.read(size, fp_out)

	456

	457 def readline(self, size=None):

	458 return self.fp.readline(size)

	459

	460 def readlines(self, sizehint=None):

	461 return self.fp.readlines(sizehint)

	462

	463 def __iter__(self):

	464 return self

	465

	466 def __next__(self):

	467 line = self.readline()

	468 if not line:

	469 raise StopIteration

	470 return line

	471

	472 def next(self):

	473 return self.__next__()

	474

	475 def read_into_file(self, fp_out=None):

	476 """Read the request body into fp_out (or make_file() if None). Return fp_out."""

	477 if fp_out is None:

	478 fp_out = self.make_file()

	479 self.read(fp_out=fp_out)

	480 return fp_out

	481

	482 def make_file(self):

	483 """Return a file-like object into which the request body will be read.

	484

	485 By default, this will return a TemporaryFile. Override as needed.

	486 See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`."""

	487 return tempfile.TemporaryFile()

	488

	489 def fullvalue(self):

	490 """Return this entity as a string, whether stored in a file or not."""

	491 if self.file:

	492 # It was stored in a tempfile. Read it.

	493 self.file.seek(0)

	494 value = self.file.read()

	495 self.file.seek(0)

	496 else:

	497 value = self.value

	498 return value

	499

	500 def process(self):

	501 """Execute the best-match processor for the given media type."""

	502 proc = None

	503 ct = self.content_type.value

	504 try:

	505 proc = self.processors[ct]

	506 except KeyError:

	507 toptype = ct.split('/', 1)[0]

	508 try:

	509 proc = self.processors[toptype]

	510 except KeyError:

	511 pass

	512 if proc is None:

	513 self.default_proc()

	514 else:

	515 proc(self)

	516

	517 def default_proc(self):

	518 """Called if a more-specific processor is not found for the ``Content-Type``."""

	519 # Leave the fp alone for someone else to read. This works fine

	520 # for request.body, but the Part subclasses need to override this

	521 # so they can move on to the next part.

	522 pass

	523

	524

	525 class Part(Entity):

	526 """A MIME part entity, part of a multipart entity."""

	527

	528 # "The default character set, which must be assumed in the absence of a

	529 # charset parameter, is US-ASCII."

	530 attempt_charsets = ['us-ascii', 'utf-8']

	531 """A list of strings, each of which should be a known encoding.

	532

	533 When the Content-Type of the request body warrants it, each of the given

	534 encodings will be tried in order. The first one to successfully decode the

	535 entity without raising an error is stored as

	536 :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults

	537 to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by

	538 `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_ ),

	539 but ``['us-ascii', 'utf-8']`` for multipart parts.

	540 """

	541

	542 boundary = None

	543 """The MIME multipart boundary."""

	544

	545 default_content_type = 'text/plain'

	546 """This defines a default ``Content-Type`` to use if no Content-Type header

	547 is given. The empty string is used for RequestBody, which results in the

	548 request body not being read or parsed at all. This is by design; a missing

	549 ``Content-Type`` header in the HTTP request entity is an error at best,

	550 and a security hole at worst. For multipart parts, however (this class),

	551 the MIME spec declares that a part with no Content-Type defaults to

	552 "text/plain".

	553 """

	554

	555 # This is the default in stdlib cgi. We may want to increase it.

	556 maxrambytes = 1000

	557 """The threshold of bytes after which point the ``Part`` will store its data

	558 in a file (generated by :func:`make_file<cherrypy._cpreqbody.Entity.make_file>`)

	559 instead of a string. Defaults to 1000, just like the :mod:`cgi` module in

	560 Python's standard library.

	561 """

	562

	563 def __init__(self, fp, headers, boundary):

	564 Entity.__init__(self, fp, headers)

	565 self.boundary = boundary

	566 self.file = None

	567 self.value = None

	568

	569 def from_fp(cls, fp, boundary):

	570 headers = cls.read_headers(fp)

	571 return cls(fp, headers, boundary)

	572 from_fp = classmethod(from_fp)

	573

	574 def read_headers(cls, fp):

	575 headers = httputil.HeaderMap()

	576 while True:

	577 line = fp.readline()

	578 if not line:

	579 # No more data--illegal end of headers

	580 raise EOFError("Illegal end of headers.")

	581

	582 if line == ntob('\r\n'):

	583 # Normal end of headers

	584 break

	585 if not line.endswith(ntob('\r\n')):

	586 raise ValueError("MIME requires CRLF terminators: %r" % line)

	587

	588 if line[0] in ntob(' \t'):

	589 # It's a continuation line.

	590 v = line.strip().decode('ISO-8859-1')

	591 else:

	592 k, v = line.split(ntob(":"), 1)

	593 k = k.strip().decode('ISO-8859-1')

	594 v = v.strip().decode('ISO-8859-1')

	595

	596 existing = headers.get(k)

	597 if existing:

	598 v = ", ".join((existing, v))

	599 headers[k] = v

	600

	601 return headers

	602 read_headers = classmethod(read_headers)

	603

	604 def read_lines_to_boundary(self, fp_out=None):

	605 """Read bytes from self.fp and return or write them to a file.

	606

	607 If the 'fp_out' argument is None (the default), all bytes read are

	608 returned in a single byte string.

	609

	610 If the 'fp_out' argument is not None, it must be a file-like object that

	611 supports the 'write' method; all bytes read will be written to the fp,

	612 and that fp is returned.

	613 """

	614 endmarker = self.boundary + ntob("--")

	615 delim = ntob("")

	616 prev_lf = True

	617 lines = []

	618 seen = 0

	619 while True:

	620 line = self.fp.readline(1<<16)

	621 if not line:

	622 raise EOFError("Illegal end of multipart body.")

	623 if line.startswith(ntob("--")) and prev_lf:

	624 strippedline = line.strip()

	625 if strippedline == self.boundary:

	626 break

	627 if strippedline == endmarker:

	628 self.fp.finish()

	629 break

	630

	631 line = delim + line

	632

	633 if line.endswith(ntob("\r\n")):

	634 delim = ntob("\r\n")

	635 line = line[:-2]

	636 prev_lf = True

	637 elif line.endswith(ntob("\n")):

	638 delim = ntob("\n")

	639 line = line[:-1]

	640 prev_lf = True

	641 else:

	642 delim = ntob("")

	643 prev_lf = False

	644

	645 if fp_out is None:

	646 lines.append(line)

	647 seen += len(line)

	648 if seen > self.maxrambytes:

	649 fp_out = self.make_file()

	650 for line in lines:

	651 fp_out.write(line)

	652 else:

	653 fp_out.write(line)

	654

	655 if fp_out is None:

	656 result = ntob('').join(lines)

	657 for charset in self.attempt_charsets:

	658 try:

	659 result = result.decode(charset)

	660 except UnicodeDecodeError:

	661 pass

	662 else:

	663 self.charset = charset

	664 return result

	665 else:

	666 raise cherrypy.HTTPError(

	667 400, "The request entity could not be decoded. The following "

	668 "charsets were attempted: %s" % repr(self.attempt_charsets))

	669 else:

	670 fp_out.seek(0)

	671 return fp_out

	672

	673 def default_proc(self):

	674 """Called if a more-specific processor is not found for the ``Content-Type``."""

	675 if self.filename:

	676 # Always read into a file if a .filename was given.

	677 self.file = self.read_into_file()

	678 else:

	679 result = self.read_lines_to_boundary()

	680 if isinstance(result, basestring):

	681 self.value = result

	682 else:

	683 self.file = result

	684

	685 def read_into_file(self, fp_out=None):

	686 """Read the request body into fp_out (or make_file() if None). Return fp_out."""

	687 if fp_out is None:

	688 fp_out = self.make_file()

	689 self.read_lines_to_boundary(fp_out=fp_out)

	690 return fp_out
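The line handling in `read_lines_to_boundary` is easier to follow in a reduced form. The sketch below keeps only the delimiter logic: each line's terminator is held back in `delim` and re-attached to the next line, so the CRLF that precedes a boundary is never part of the payload. `read_to_boundary` is a hypothetical function; it omits the `maxrambytes` spill-to-file and charset decoding of the real method.

```python
import io


def read_to_boundary(fp, boundary):
    """Read one part's payload from fp; return (payload_bytes, is_last_part)."""
    endmarker = boundary + b"--"
    delim = b""      # line terminator held back from the previous line
    prev_lf = True   # a boundary only counts at the start of a line
    chunks = []
    while True:
        line = fp.readline(1 << 16)
        if not line:
            raise EOFError("Illegal end of multipart body.")
        if prev_lf and line.startswith(b"--"):
            stripped = line.strip()
            if stripped == boundary:
                return b"".join(chunks), False
            if stripped == endmarker:
                return b"".join(chunks), True
        # Re-attach the terminator withheld from the previous line.
        line = delim + line
        if line.endswith(b"\r\n"):
            delim, line, prev_lf = b"\r\n", line[:-2], True
        elif line.endswith(b"\n"):
            delim, line, prev_lf = b"\n", line[:-1], True
        else:
            delim, prev_lf = b"", False
        chunks.append(line)


body = io.BytesIO(b"hello\r\nworld\r\n--XYZ\r\nnext\r\n--XYZ--\r\n")
first = read_to_boundary(body, b"--XYZ")
second = read_to_boundary(body, b"--XYZ")
```

The first call stops at the plain boundary and the second at the end marker, which is where the real method calls `self.fp.finish()`.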

	691

	692 Entity.part_class = Part

	693

	694 try:

	695 inf = float('inf')

	696 except ValueError:

	697 # Python 2.4 and lower

	698 class Infinity(object):

	699 def __cmp__(self, other):

	700 return 1

	701 def __sub__(self, other):

	702 return self

	703 inf = Infinity()

	704

	705

	706 comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept-Encoding',

	707 'Accept-Language', 'Accept-Ranges', 'Allow', 'Cache-Control', 'Connection',

	708 'Content-Encoding', 'Content-Language', 'Expect', 'If-Match',

	709 'If-None-Match', 'Pragma', 'Proxy-Authenticate', 'Te', 'Trailer',

	710 'Transfer-Encoding', 'Upgrade', 'Vary', 'Via', 'Warning', 'Www-Authenticate' ]

	711

	712

	713 class SizedReader:

	714

	715 def __init__(self, fp, length, maxbytes, bufsize=DEFAULT_BUFFER_SIZE, has_trailers=False):

	716 # Keep the raw fp; self.buffer holds bytes that readline() read past the newline

	717 self.fp = fp

	718 self.length = length

	719 self.maxbytes = maxbytes

	720 self.buffer = ntob('')

	721 self.bufsize = bufsize

	722 self.bytes_read = 0

	723 self.done = False

	724 self.has_trailers = has_trailers

	725

	726 def read(self, size=None, fp_out=None):

	727 """Read bytes from the request body and return or write them to a file.

	728

	729 A number of bytes less than or equal to the 'size' argument are read

	730 off the socket. The actual number of bytes read are tracked in

	731 self.bytes_read. The number may be smaller than 'size' when 1) the

	732 client sends fewer bytes, 2) the 'Content-Length' request header

	733 specifies fewer bytes than requested, or 3) the number of bytes read

	734 exceeds self.maxbytes (in which case, 413 is raised).

	735

	736 If the 'fp_out' argument is None (the default), all bytes read are

	737 returned in a single byte string.

	738

	739 If the 'fp_out' argument is not None, it must be a file-like object that

	740 supports the 'write' method; all bytes read will be written to the fp,

	741 and None is returned.

	742 """

	743

	744 if self.length is None:

	745 if size is None:

	746 remaining = inf

	747 else:

	748 remaining = size

	749 else:

	750 remaining = self.length - self.bytes_read

	751 if size and size < remaining:

	752 remaining = size

	753 if remaining == 0:

	754 self.finish()

	755 if fp_out is None:

	756 return ntob('')

	757 else:

	758 return None

	759

	760 chunks = []

	761

	762 # Read bytes from the buffer.

	763 if self.buffer:

	764 if remaining is inf:

	765 data = self.buffer

	766 self.buffer = ntob('')

	767 else:

	768 data = self.buffer[:remaining]

	769 self.buffer = self.buffer[remaining:]

	770 datalen = len(data)

	771 remaining -= datalen

	772

	773 # Check lengths.

	774 self.bytes_read += datalen

	775 if self.maxbytes and self.bytes_read > self.maxbytes:

	776 raise cherrypy.HTTPError(413)

	777

	778 # Store the data.

	779 if fp_out is None:

	780 chunks.append(data)

	781 else:

	782 fp_out.write(data)

	783

	784 # Read bytes from the socket.

	785 while remaining > 0:

	786 chunksize = min(remaining, self.bufsize)

	787 try:

	788 data = self.fp.read(chunksize)

	789 except Exception:

	790 e = sys.exc_info()[1]

	791 if e.__class__.__name__ == 'MaxSizeExceeded':

	792 # Post data is too big

	793 raise cherrypy.HTTPError(

	794 413, "Maximum request length: %r" % e.args[1])

	795 else:

	796 raise

	797 if not data:

	798 self.finish()

	799 break

	800 datalen = len(data)

	801 remaining -= datalen

	802

	803 # Check lengths.

	804 self.bytes_read += datalen

	805 if self.maxbytes and self.bytes_read > self.maxbytes:

	806 raise cherrypy.HTTPError(413)

	807

	808 # Store the data.

	809 if fp_out is None:

	810 chunks.append(data)

	811 else:

	812 fp_out.write(data)

	813

	814 if fp_out is None:

	815 return ntob('').join(chunks)

	816

	817 def readline(self, size=None):

	818 """Read a line from the request body and return it."""

	819 chunks = []

	820 while size is None or size > 0:

	821 chunksize = self.bufsize

	822 if size is not None and size < self.bufsize:

	823 chunksize = size

	824 data = self.read(chunksize)

	825 if not data:

	826 break

	827 pos = data.find(ntob('\n')) + 1

	828 if pos:

	829 chunks.append(data[:pos])

	830 remainder = data[pos:]

	831 self.buffer += remainder

	832 self.bytes_read -= len(remainder)

	833 break

	834 else:

	835 chunks.append(data)

	836 return ntob('').join(chunks)

	837

	838 def readlines(self, sizehint=None):

	839 """Read lines from the request body and return them."""

	840 if self.length is not None:

	841 if sizehint is None:

	842 sizehint = self.length - self.bytes_read

	843 else:

	844 sizehint = min(sizehint, self.length - self.bytes_read)

	845

	846 lines = []

	847 seen = 0

	848 while True:

	849 line = self.readline()

	850 if not line:

	851 break

	852 lines.append(line)

	853 seen += len(line)

	854 if seen >= sizehint:

	855 break

	856 return lines

	857

	858 def finish(self):

	859 self.done = True

	860 if self.has_trailers and hasattr(self.fp, 'read_trailer_lines'):

	861 self.trailers = {}

	862

	863 try:

	864 for line in self.fp.read_trailer_lines():

	865 if line[0] in ntob(' \t'):

	866 # It's a continuation line.

	867 v = line.strip()

	868 else:

	869 try:

	870 k, v = line.split(ntob(":"), 1)

	871 except ValueError:

	872 raise ValueError("Illegal header line.")

	873 k = k.strip().title()

	874 v = v.strip()

	875

	876 if k in comma_separated_headers:

	877 existing = self.trailers.get(k)

	878 if existing:

	879 v = ntob(", ").join((existing, v))

	880 self.trailers[k] = v

	881 except Exception:

	882 e = sys.exc_info()[1]

	883 if e.__class__.__name__ == 'MaxSizeExceeded':

	884 # Post data is too big

	885 raise cherrypy.HTTPError(

	886 413, "Maximum request length: %r" % e.args[1])

	887 else:

	888 raise

	889

	890

	891 class RequestBody(Entity):

	892 """The entity of the HTTP request."""

	893

	894 bufsize = 8 * 1024

	895 """The buffer size used when reading the socket."""

	896

	897 # Don't parse the request body at all if the client didn't provide

	898 # a Content-Type header. See http://www.cherrypy.org/ticket/790

	899 default_content_type = ''

	900 """This defines a default ``Content-Type`` to use if no Content-Type header

	901 is given. The empty string is used for RequestBody, which results in the

	902 request body not being read or parsed at all. This is by design; a missing

	903 ``Content-Type`` header in the HTTP request entity is an error at best,

	904 and a security hole at worst. For multipart parts, however, the MIME spec

	905 declares that a part with no Content-Type defaults to "text/plain"

	906 (see :class:`Part<cherrypy._cpreqbody.Part>`).

	907 """

	908

	909 maxbytes = None

	910 """Raise ``MaxSizeExceeded`` if more bytes than this are read from the socket."""

	911

	912 def __init__(self, fp, headers, params=None, request_params=None):

	913 Entity.__init__(self, fp, headers, params)

	914

	915 # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1

	916 # When no explicit charset parameter is provided by the

	917 # sender, media subtypes of the "text" type are defined

	918 # to have a default charset value of "ISO-8859-1" when

	919 # received via HTTP.

	920 if self.content_type.value.startswith('text/'):

	921 for c in ('ISO-8859-1', 'iso-8859-1', 'Latin-1', 'latin-1'):

	922 if c in self.attempt_charsets:

	923 break

	924 else:

	925 self.attempt_charsets.append('ISO-8859-1')

	926

	927 # Temporary fix while deprecating passing .parts as .params.

	928 self.processors['multipart'] = _old_process_multipart

	929

	930 if request_params is None:

	931 request_params = {}

	932 self.request_params = request_params

	933

	934 def process(self):

	935 """Process the request entity based on its Content-Type."""

	936 # "The presence of a message-body in a request is signaled by the

	937 # inclusion of a Content-Length or Transfer-Encoding header field in

	938 # the request's message-headers."

	939 # It is possible to send a POST request with no body, for example;

	940 # however, app developers are responsible in that case to set

	941 # cherrypy.request.process_body to False so this method isn't called.

	942 h = cherrypy.serving.request.headers

	943 if 'Content-Length' not in h and 'Transfer-Encoding' not in h:

	944 raise cherrypy.HTTPError(411)

	945

	946 self.fp = SizedReader(self.fp, self.length,

	947 self.maxbytes, bufsize=self.bufsize,

	948 has_trailers='Trailer' in h)

	949 super(RequestBody, self).process()

	950

	951 # Body params should also be a part of the request_params

	952 # add them in here.

	953 request_params = self.request_params

	954 for key, value in self.params.items():

	955 # Python 2 only: keyword arguments must be byte strings (type 'str').

	956 if sys.version_info < (3, 0):

	957 if isinstance(key, unicode):

	958 key = key.encode('ISO-8859-1')

	959

	960 if key in request_params:

	961 if not isinstance(request_params[key], list):

	962 request_params[key] = [request_params[key]]

	963 request_params[key].append(value)

	964 else:

	965 request_params[key] = value
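
The loop at the end of `RequestBody.process` folds body parameters into `request_params`, turning a repeated key into a list. A minimal standalone sketch of that merge follows; the helper name `merge_params` and the sample data are hypothetical (not part of CherryPy), and the Python 2 key-encoding step is left out.

```python
def merge_params(request_params, body_params):
    """Fold body_params into request_params in place and return it.

    Hypothetical helper mirroring the loop in RequestBody.process:
    a key seen for the first time is stored as-is; a key that already
    exists is promoted to a list and the new value is appended.
    """
    for key, value in body_params.items():
        if key in request_params:
            if not isinstance(request_params[key], list):
                # Promote the existing single value to a list.
                request_params[key] = [request_params[key]]
            request_params[key].append(value)
        else:
            request_params[key] = value
    return request_params


# Example: "tag" appears in both the query string and the body.
merged = merge_params({"tag": "a", "page": "1"}, {"tag": "b", "q": "x"})
print(merged)
```

One consequence of this design is that a handler may receive either a single value or a list for the same parameter name, depending on whether the name was repeated across the query string and the body.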
