Chromium Code Reviews

Side by Side Diff: third_party/cherrypy/_cpreqbody.py

Issue 9368042: Add CherryPy to third_party. (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/tools/build/
Patch Set: '' Created 8 years, 10 months ago
Property Changes:
Added: svn:eol-style
+ LF
OLDNEW
(Empty)
1 """Request body processing for CherryPy.
2
3 .. versionadded:: 3.2
4
5 Application authors have complete control over the parsing of HTTP request
6 entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`
7 is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,
8 and *that* class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.
9
10 When an HTTP request includes an entity body, it is often desirable to
11 provide that information to applications in a form other than the raw bytes.
12 Different content types demand different approaches. Examples:
13
14 * For a GIF file, we want the raw bytes in a stream.
15 * An HTML form is better parsed into its component fields, and each text field
16 decoded from bytes to unicode.
17 * A JSON body should be deserialized into a Python dict or list.
18
19 When the request contains a Content-Type header, the media type is used as a
20 key to look up a value in the
21 :attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.
22 If the full media
23 type is not found, then the major type is tried; for example, if no processor
24 is found for the 'image/jpeg' type, then we look for a processor for the 'image'
25 types altogether. If neither the full type nor the major type has a matching
26 processor, then a default processor is used
27 (:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most
28 types, this means no processing is done, and the body is left unread as a
29 raw byte stream. Processors are configurable in an 'on_start_resource' hook.
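
The lookup order described above (full media type, then major type, then a
default) can be sketched as follows. This is a minimal illustration, not
CherryPy's own code: ``find_processor`` and the string placeholders standing in
for processor functions are hypothetical names.

```python
def find_processor(processors, media_type, default=None):
    # Hypothetical helper, not part of CherryPy: mirrors the lookup order
    # described above (full media type, then major type, then a default).
    if media_type in processors:
        return processors[media_type]
    major_type = media_type.split('/', 1)[0]
    return processors.get(major_type, default)


# Strings stand in for processor callables to keep the sketch self-contained.
processors = {
    'application/json': 'json_processor',
    'image': 'image_processor',
}
full_match = find_processor(processors, 'application/json')
major_match = find_processor(processors, 'image/jpeg')
fallback = find_processor(processors, 'text/csv', default='default_proc')
```

Here ``image/jpeg`` has no exact entry, so the ``image`` entry is used, while
``text/csv`` matches neither level and falls through to the default.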
30
31 Some processors, especially those for the 'text' types, attempt to decode bytes
32 to unicode. If the Content-Type request header includes a 'charset' parameter,
33 this is used to decode the entity. Otherwise, one or more default charsets may
34 be attempted, although this decision is up to each processor. If a processor
35 successfully decodes an Entity or Part, it should set the
36 :attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute
37 on the Entity or Part to the name of the successful charset, so that
38 applications can easily re-encode or transcode the value if they wish.
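
As a rough sketch of the "try each charset in order" behavior described above,
assuming Python 3: ``decode_with_charsets`` is a hypothetical helper, and a real
processor would record the winning charset on the Entity or Part rather than
return it.

```python
def decode_with_charsets(raw, attempt_charsets):
    # Hypothetical helper: try each charset in order and report which one
    # succeeded, so the caller can store it (for example as entity.charset).
    for charset in attempt_charsets:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError('could not decode with any of %r' % (attempt_charsets,))


# The UTF-8 encoding of a four-character word whose last character is non-ASCII.
raw = bytes([99, 97, 102, 195, 169])
text, charset = decode_with_charsets(raw, ['us-ascii', 'utf-8'])
```

Because the bytes are not valid US-ASCII, the first attempt fails and the
second charset in the list is the one that gets recorded.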
39
40 If the Content-Type of the request entity is of major type 'multipart', then
41 the above parsing process, and possibly a decoding process, is performed for
42 each part.
43
44 For both the full entity and multipart parts, a Content-Disposition header may
45 be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and
46 :attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the
47 request.body or the Part.
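
To illustrate how ``name`` and ``filename`` might be pulled out of a
Content-Disposition value, here is a deliberately naive sketch. CherryPy itself
relies on its ``httputil`` header parsing; ``parse_disposition`` below is a
hypothetical stand-in that only reproduces the quote-stripping step.

```python
def parse_disposition(value):
    # Hypothetical, deliberately naive parser: splits on ';' and strips one
    # pair of surrounding double quotes from each parameter value.
    # It does not handle ';' characters inside quoted values.
    pieces = [p.strip() for p in value.split(';')]
    params = {}
    for piece in pieces[1:]:
        if '=' not in piece:
            continue
        key, val = piece.split('=', 1)
        key = key.strip().lower()
        val = val.strip()
        if len(val) >= 2 and val.startswith('"') and val.endswith('"'):
            val = val[1:-1]
        params[key] = val
    return pieces[0], params


disposition, params = parse_disposition(
    'form-data; name="avatar"; filename="me.png"')
```

A part with both parameters present would be treated as a file upload; a part
with only ``name`` would be treated as a regular form field.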
48
49 .. _custombodyprocessors:
50
51 Custom Processors
52 =================
53
54 You can add your own processors for any specific or major MIME type. Simply add
55 them to the :attr:`processors<cherrypy._cpreqbody.Entity.processors>` dict in a
56 hook/tool that runs at ``on_start_resource`` or ``before_request_body``.
57 Here's the built-in JSON tool as an example::
58
59 def json_in(force=True, debug=False):
60 request = cherrypy.serving.request
61 def json_processor(entity):
62 \"""Read application/json data into request.json.\"""
63 if not entity.headers.get("Content-Length", ""):
64 raise cherrypy.HTTPError(411)
65
66 body = entity.fp.read()
67 try:
68 request.json = json_decode(body)
69 except ValueError:
70 raise cherrypy.HTTPError(400, 'Invalid JSON document')
71 if force:
72 request.body.processors.clear()
73 request.body.default_proc = cherrypy.HTTPError(
74 415, 'Expected an application/json content type')
75 request.body.processors['application/json'] = json_processor
76
77 We begin by defining a new ``json_processor`` function to stick in the ``processors``
78 dictionary. All processor functions take a single argument, the ``Entity`` instance
79 they are to process. It will be called whenever a request is received (for those
80 URIs where the tool is turned on) which has a ``Content-Type`` of
81 "application/json".
82
83 First, it checks that a ``Content-Length`` header is present (raising 411 if it
84 is missing or empty), then reads the remaining bytes on the socket. The ``fp``
85 object knows its own length, so it won't hang waiting for data that never
86 arrives. It will return when all data has been read. Then, we decode those bytes
87 using Python's built-in ``json`` module, and stick the decoded result onto
88 ``request.json``. If it cannot be decoded, we raise 400.
89
90 If the "force" argument is True (the default), the ``Tool`` clears the ``processors``
91 dict so that request entities of other ``Content-Types`` aren't parsed at all. Since
92 there's no entry for those invalid MIME types, the ``default_proc`` method of ``cherrypy.request.body``
93 is called. By default this does nothing (usually to give the page handler an opportunity to handle the body).
94 In our case, however, we want to raise 415, so we replace ``request.body.default_proc``
95 with the error (``HTTPError`` instances, when called, raise themselves).
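
The "instances raise themselves when called" trick can be sketched without
CherryPy. ``SelfRaisingError`` and ``FakeBody`` are hypothetical stand-ins used
only to show why assigning an exception instance to ``default_proc`` turns the
fallback path into an error.

```python
class SelfRaisingError(Exception):
    # Hypothetical stand-in for the behavior described above: calling an
    # instance raises that same instance.
    def __call__(self, *args, **kwargs):
        raise self


class FakeBody(object):
    # Hypothetical minimal body object whose default_proc does nothing.
    def default_proc(self):
        pass


body = FakeBody()
# Replace the do-nothing fallback with an error instance.
body.default_proc = SelfRaisingError(
    415, 'Expected an application/json content type')

try:
    body.default_proc()
    outcome = 'no error'
except SelfRaisingError as exc:
    outcome = exc.args
```

Any code that would have called the do-nothing fallback now raises the stored
error instead, with no change needed at the call site.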
96
97 If we are defining a custom processor, we can do so without making a ``Tool``. Just add the config entry::
98
99 request.body.processors = {'application/json': json_processor}
100
101 Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.
102 """
103
104 try:
105 from io import DEFAULT_BUFFER_SIZE
106 except ImportError:
107 DEFAULT_BUFFER_SIZE = 8192
108 import re
109 import sys
110 import tempfile
111 try:
112 from urllib import unquote_plus
113 except ImportError:
114 def unquote_plus(bs):
115 """Bytes version of urllib.parse.unquote_plus."""
116 bs = bs.replace(ntob('+'), ntob(' '))
117 atoms = bs.split(ntob('%'))
118 for i in range(1, len(atoms)):
119 item = atoms[i]
120 try:
121 pct = int(item[:2], 16)
122 atoms[i] = bytes([pct]) + item[2:]
123 except ValueError:
124 pass
125 return ntob('').join(atoms)
126
127 import cherrypy
128 from cherrypy._cpcompat import basestring, ntob, ntou
129 from cherrypy.lib import httputil
130
131
132 # -------------------------------- Processors -------------------------------- #
133
134 def process_urlencoded(entity):
135 """Read application/x-www-form-urlencoded data into entity.params."""
136 qs = entity.fp.read()
137 for charset in entity.attempt_charsets:
138 try:
139 params = {}
140 for aparam in qs.split(ntob('&')):
141 for pair in aparam.split(ntob(';')):
142 if not pair:
143 continue
144
145 atoms = pair.split(ntob('='), 1)
146 if len(atoms) == 1:
147 atoms.append(ntob(''))
148
149 key = unquote_plus(atoms[0]).decode(charset)
150 value = unquote_plus(atoms[1]).decode(charset)
151
152 if key in params:
153 if not isinstance(params[key], list):
154 params[key] = [params[key]]
155 params[key].append(value)
156 else:
157 params[key] = value
158 except UnicodeDecodeError:
159 pass
160 else:
161 entity.charset = charset
162 break
163 else:
164 raise cherrypy.HTTPError(
165 400, "The request entity could not be decoded. The following "
166 "charsets were attempted: %s" % repr(entity.attempt_charsets))
167
168 # Now that all values have been successfully parsed and decoded,
169 # apply them to the entity.params dict.
170 for key, value in params.items():
171 if key in entity.params:
172 if not isinstance(entity.params[key], list):
173 entity.params[key] = [entity.params[key]]
174 entity.params[key].append(value)
175 else:
176 entity.params[key] = value
177
178
179 def process_multipart(entity):
180 """Read all multipart parts into entity.parts."""
181 ib = ""
182 if 'boundary' in entity.content_type.params:
183 # http://tools.ietf.org/html/rfc2046#section-5.1.1
184 # "The grammar for parameters on the Content-type field is such that it
185 # is often necessary to enclose the boundary parameter values in quotes
186 # on the Content-type line"
187 ib = entity.content_type.params['boundary'].strip('"')
188
189 if not re.match("^[ -~]{0,200}[!-~]$", ib):
190 raise ValueError('Invalid boundary in multipart form: %r' % (ib,))
191
192 ib = ('--' + ib).encode('ascii')
193
194 # Find the first marker
195 while True:
196 b = entity.readline()
197 if not b:
198 return
199
200 b = b.strip()
201 if b == ib:
202 break
203
204 # Read all parts
205 while True:
206 part = entity.part_class.from_fp(entity.fp, ib)
207 entity.parts.append(part)
208 part.process()
209 if part.fp.done:
210 break
211
212 def process_multipart_form_data(entity):
213 """Read all multipart/form-data parts into entity.parts or entity.params."""
214 process_multipart(entity)
215
216 kept_parts = []
217 for part in entity.parts:
218 if part.name is None:
219 kept_parts.append(part)
220 else:
221 if part.filename is None:
222 # It's a regular field
223 value = part.fullvalue()
224 else:
225 # It's a file upload. Retain the whole part so consumer code
226 # has access to its .file and .filename attributes.
227 value = part
228
229 if part.name in entity.params:
230 if not isinstance(entity.params[part.name], list):
231 entity.params[part.name] = [entity.params[part.name]]
232 entity.params[part.name].append(value)
233 else:
234 entity.params[part.name] = value
235
236 entity.parts = kept_parts
237
238 def _old_process_multipart(entity):
239 """The behavior of 3.2 and lower. Deprecated and will be changed in 3.3."""
240 process_multipart(entity)
241
242 params = entity.params
243
244 for part in entity.parts:
245 if part.name is None:
246 key = ntou('parts')
247 else:
248 key = part.name
249
250 if part.filename is None:
251 # It's a regular field
252 value = part.fullvalue()
253 else:
254 # It's a file upload. Retain the whole part so consumer code
255 # has access to its .file and .filename attributes.
256 value = part
257
258 if key in params:
259 if not isinstance(params[key], list):
260 params[key] = [params[key]]
261 params[key].append(value)
262 else:
263 params[key] = value
264
265
266
267 # --------------------------------- Entities --------------------------------- #
268
269
270 class Entity(object):
271 """An HTTP request body, or MIME multipart body.
272
273 This class collects information about the HTTP request entity. When a
274 given entity is of MIME type "multipart", each part is parsed into its own
275 Entity instance, and the set of parts stored in
276 :attr:`entity.parts<cherrypy._cpreqbody.Entity.parts>`.
277
278 Between the ``before_request_body`` and ``before_handler`` tools, CherryPy
279 tries to process the request body (if any) by calling
280 :func:`request.body.process<cherrypy._cpreqbody.RequestBody.process>`.
281 This uses the ``content_type`` of the Entity to look up a suitable processor
282 in :attr:`Entity.processors<cherrypy._cpreqbody.Entity.processors>`, a dict.
283 If a matching processor cannot be found for the complete Content-Type,
284 it tries again using the major type. For example, if a request with an
285 entity of type "image/jpeg" arrives, but no processor can be found for
286 that complete type, then one is sought for the major type "image". If a
287 processor is still not found, then the
288 :func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>` method of the
289 Entity is called (which does nothing by default; you can override this too).
290
291 CherryPy includes processors for the "application/x-www-form-urlencoded"
292 type, the "multipart/form-data" type, and the "multipart" major type.
293 CherryPy 3.2 processes these types almost exactly as older versions.
294 Parts are passed as arguments to the page handler using their
295 ``Content-Disposition.name`` if given, otherwise in a generic "parts"
296 argument. Each such part is either a string, or the
297 :class:`Part<cherrypy._cpreqbody.Part>` itself if it's a file. (In this
298 case it will have ``file`` and ``filename`` attributes, or possibly a
299 ``value`` attribute). Each Part is itself a subclass of
300 Entity, and has its own ``process`` method and ``processors`` dict.
301
302 There is a separate processor for the "multipart" major type which is more
303 flexible, and simply stores all multipart parts in
304 :attr:`request.body.parts<cherrypy._cpreqbody.Entity.parts>`. You can
305 enable it with::
306
307 cherrypy.request.body.processors['multipart'] = _cpreqbody.process_multipart
308
309 in an ``on_start_resource`` tool.
310 """
311
312 # http://tools.ietf.org/html/rfc2046#section-4.1.2:
313 # "The default character set, which must be assumed in the
314 # absence of a charset parameter, is US-ASCII."
315 # However, many browsers send data in utf-8 with no charset.
316 attempt_charsets = ['utf-8']
317 """A list of strings, each of which should be a known encoding.
318
319 When the Content-Type of the request body warrants it, each of the given
320 encodings will be tried in order. The first one to successfully decode the
321 entity without raising an error is stored as
322 :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
323 to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
324 `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_ ),
325 but ``['us-ascii', 'utf-8']`` for multipart parts.
326 """
327
328 charset = None
329 """The successful decoding; see "attempt_charsets" above."""
330
331 content_type = None
332 """The value of the Content-Type request header.
333
334 If the Entity is part of a multipart payload, this will be the Content-Type
335 given in the MIME headers for this part.
336 """
337
338 default_content_type = 'application/x-www-form-urlencoded'
339 """This defines a default ``Content-Type`` to use if no Content-Type header
340 is given. The empty string is used for RequestBody, which results in the
341 request body not being read or parsed at all. This is by design; a missing
342 ``Content-Type`` header in the HTTP request entity is an error at best,
343 and a security hole at worst. For multipart parts, however, the MIME spec
344 declares that a part with no Content-Type defaults to "text/plain"
345 (see :class:`Part<cherrypy._cpreqbody.Part>`).
346 """
347
348 filename = None
349 """The ``Content-Disposition.filename`` header, if available."""
350
351 fp = None
352 """The readable socket file object."""
353
354 headers = None
355 """A dict of request/multipart header names and values.
356
357 This is a copy of the ``request.headers`` for the ``request.body``;
358 for multipart parts, it is the set of headers for that part.
359 """
360
361 length = None
362 """The value of the ``Content-Length`` header, if provided."""
363
364 name = None
365 """The "name" parameter of the ``Content-Disposition`` header, if any."""
366
367 params = None
368 """
369 If the request Content-Type is 'application/x-www-form-urlencoded' or
370 multipart, this will be a dict of the params pulled from the entity
371 body; that is, it will be the portion of request.params that come
372 from the message body (sometimes called "POST params", although they
373 can be sent with various HTTP method verbs). This value is set between
374 the 'before_request_body' and 'before_handler' hooks (assuming that
375 process_request_body is True)."""
376
377 processors = {'application/x-www-form-urlencoded': process_urlencoded,
378 'multipart/form-data': process_multipart_form_data,
379 'multipart': process_multipart,
380 }
381 """A dict of Content-Type names to processor methods."""
382
383 parts = None
384 """A list of Part instances if ``Content-Type`` is of major type "multipart"."""
385
386 part_class = None
387 """The class used for multipart parts.
388
389 You can replace this with custom subclasses to alter the processing of
390 multipart parts.
391 """
392
393 def __init__(self, fp, headers, params=None, parts=None):
394 # Make an instance-specific copy of the class processors
395 # so Tools, etc. can replace them per-request.
396 self.processors = self.processors.copy()
397
398 self.fp = fp
399 self.headers = headers
400
401 if params is None:
402 params = {}
403 self.params = params
404
405 if parts is None:
406 parts = []
407 self.parts = parts
408
409 # Content-Type
410 self.content_type = headers.elements('Content-Type')
411 if self.content_type:
412 self.content_type = self.content_type[0]
413 else:
414 self.content_type = httputil.HeaderElement.from_str(
415 self.default_content_type)
416
417 # Copy the class 'attempt_charsets', prepending any Content-Type charset
418 dec = self.content_type.params.get("charset", None)
419 if dec:
420 self.attempt_charsets = [dec] + [c for c in self.attempt_charsets
421 if c != dec]
422 else:
423 self.attempt_charsets = self.attempt_charsets[:]
424
425 # Length
426 self.length = None
427 clen = headers.get('Content-Length', None)
428 # If Transfer-Encoding is 'chunked', ignore any Content-Length.
429 if clen is not None and 'chunked' not in headers.get('Transfer-Encoding', ''):
430 try:
431 self.length = int(clen)
432 except ValueError:
433 pass
434
435 # Content-Disposition
436 self.name = None
437 self.filename = None
438 disp = headers.elements('Content-Disposition')
439 if disp:
440 disp = disp[0]
441 if 'name' in disp.params:
442 self.name = disp.params['name']
443 if self.name.startswith('"') and self.name.endswith('"'):
444 self.name = self.name[1:-1]
445 if 'filename' in disp.params:
446 self.filename = disp.params['filename']
447 if self.filename.startswith('"') and self.filename.endswith('"'):
448 self.filename = self.filename[1:-1]
449
450 # The 'type' attribute is deprecated in 3.2; remove it in 3.3.
451 type = property(lambda self: self.content_type,
452 doc="""A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.""")
453
454 def read(self, size=None, fp_out=None):
455 return self.fp.read(size, fp_out)
456
457 def readline(self, size=None):
458 return self.fp.readline(size)
459
460 def readlines(self, sizehint=None):
461 return self.fp.readlines(sizehint)
462
463 def __iter__(self):
464 return self
465
466 def __next__(self):
467 line = self.readline()
468 if not line:
469 raise StopIteration
470 return line
471
472 def next(self):
473 return self.__next__()
474
475 def read_into_file(self, fp_out=None):
476 """Read the request body into fp_out (or make_file() if None). Return fp_out."""
477 if fp_out is None:
478 fp_out = self.make_file()
479 self.read(fp_out=fp_out)
480 return fp_out
481
482 def make_file(self):
483 """Return a file-like object into which the request body will be read.
484
485 By default, this will return a TemporaryFile. Override as needed.
486 See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`."""
487 return tempfile.TemporaryFile()
488
489 def fullvalue(self):
490 """Return this entity as a string, whether stored in a file or not."""
491 if self.file:
492 # It was stored in a tempfile. Read it.
493 self.file.seek(0)
494 value = self.file.read()
495 self.file.seek(0)
496 else:
497 value = self.value
498 return value
499
500 def process(self):
501 """Execute the best-match processor for the given media type."""
502 proc = None
503 ct = self.content_type.value
504 try:
505 proc = self.processors[ct]
506 except KeyError:
507 toptype = ct.split('/', 1)[0]
508 try:
509 proc = self.processors[toptype]
510 except KeyError:
511 pass
512 if proc is None:
513 self.default_proc()
514 else:
515 proc(self)
516
517 def default_proc(self):
518 """Called if a more-specific processor is not found for the ``Content-Type``."""
519 # Leave the fp alone for someone else to read. This works fine
520 # for request.body, but the Part subclasses need to override this
521 # so they can move on to the next part.
522 pass
523
524
525 class Part(Entity):
526 """A MIME part entity, part of a multipart entity."""
527
528 # "The default character set, which must be assumed in the absence of a
529 # charset parameter, is US-ASCII."
530 attempt_charsets = ['us-ascii', 'utf-8']
531 """A list of strings, each of which should be a known encoding.
532
533 When the Content-Type of the request body warrants it, each of the given
534 encodings will be tried in order. The first one to successfully decode the
535 entity without raising an error is stored as
536 :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
537 to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
538 `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_ ),
539 but ``['us-ascii', 'utf-8']`` for multipart parts.
540 """
541
542 boundary = None
543 """The MIME multipart boundary."""
544
545 default_content_type = 'text/plain'
546 """This defines a default ``Content-Type`` to use if no Content-Type header
547 is given. The empty string is used for RequestBody, which results in the
548 request body not being read or parsed at all. This is by design; a missing
549 ``Content-Type`` header in the HTTP request entity is an error at best,
550 and a security hole at worst. For multipart parts, however (this class),
551 the MIME spec declares that a part with no Content-Type defaults to
552 "text/plain".
553 """
554
555 # This is the default in stdlib cgi. We may want to increase it.
556 maxrambytes = 1000
557 """The threshold of bytes after which point the ``Part`` will store its data
558 in a file (generated by :func:`make_file<cherrypy._cpreqbody.Entity.make_file>`)
559 instead of a string. Defaults to 1000, just like the :mod:`cgi` module in
560 Python's standard library.
561 """
562
563 def __init__(self, fp, headers, boundary):
564 Entity.__init__(self, fp, headers)
565 self.boundary = boundary
566 self.file = None
567 self.value = None
568
569 def from_fp(cls, fp, boundary):
570 headers = cls.read_headers(fp)
571 return cls(fp, headers, boundary)
572 from_fp = classmethod(from_fp)
573
574 def read_headers(cls, fp):
575 headers = httputil.HeaderMap()
576 while True:
577 line = fp.readline()
578 if not line:
579 # No more data--illegal end of headers
580 raise EOFError("Illegal end of headers.")
581
582 if line == ntob('\r\n'):
583 # Normal end of headers
584 break
585 if not line.endswith(ntob('\r\n')):
586 raise ValueError("MIME requires CRLF terminators: %r" % line)
587
588 if line[0] in ntob(' \t'):
589 # It's a continuation line.
590 v = line.strip().decode('ISO-8859-1')
591 else:
592 k, v = line.split(ntob(":"), 1)
593 k = k.strip().decode('ISO-8859-1')
594 v = v.strip().decode('ISO-8859-1')
595
596 existing = headers.get(k)
597 if existing:
598 v = ", ".join((existing, v))
599 headers[k] = v
600
601 return headers
602 read_headers = classmethod(read_headers)
603
604 def read_lines_to_boundary(self, fp_out=None):
605 """Read bytes from self.fp and return or write them to a file.
606
607 If the 'fp_out' argument is None (the default), all bytes read are
608 returned in a single byte string.
609
610 If the 'fp_out' argument is not None, it must be a file-like object that
611 supports the 'write' method; all bytes read will be written to the fp,
612 and that fp is returned.
613 """
614 endmarker = self.boundary + ntob("--")
615 delim = ntob("")
616 prev_lf = True
617 lines = []
618 seen = 0
619 while True:
620 line = self.fp.readline(1<<16)
621 if not line:
622 raise EOFError("Illegal end of multipart body.")
623 if line.startswith(ntob("--")) and prev_lf:
624 strippedline = line.strip()
625 if strippedline == self.boundary:
626 break
627 if strippedline == endmarker:
628 self.fp.finish()
629 break
630
631 line = delim + line
632
633 if line.endswith(ntob("\r\n")):
634 delim = ntob("\r\n")
635 line = line[:-2]
636 prev_lf = True
637 elif line.endswith(ntob("\n")):
638 delim = ntob("\n")
639 line = line[:-1]
640 prev_lf = True
641 else:
642 delim = ntob("")
643 prev_lf = False
644
645 if fp_out is None:
646 lines.append(line)
647 seen += len(line)
648 if seen > self.maxrambytes:
649 fp_out = self.make_file()
650 for line in lines:
651 fp_out.write(line)
652 else:
653 fp_out.write(line)
654
655 if fp_out is None:
656 result = ntob('').join(lines)
657 for charset in self.attempt_charsets:
658 try:
659 result = result.decode(charset)
660 except UnicodeDecodeError:
661 pass
662 else:
663 self.charset = charset
664 return result
665 else:
666 raise cherrypy.HTTPError(
667 400, "The request entity could not be decoded. The following "
668 "charsets were attempted: %s" % repr(self.attempt_charsets))
669 else:
670 fp_out.seek(0)
671 return fp_out
672
673 def default_proc(self):
674 """Called if a more-specific processor is not found for the ``Content-Type``."""
675 if self.filename:
676 # Always read into a file if a .filename was given.
677 self.file = self.read_into_file()
678 else:
679 result = self.read_lines_to_boundary()
680 if isinstance(result, basestring):
681 self.value = result
682 else:
683 self.file = result
684
685 def read_into_file(self, fp_out=None):
686 """Read the request body into fp_out (or make_file() if None). Return fp_out."""
687 if fp_out is None:
688 fp_out = self.make_file()
689 self.read_lines_to_boundary(fp_out=fp_out)
690 return fp_out
691
692 Entity.part_class = Part
693
694 try:
695 inf = float('inf')
696 except ValueError:
697 # Python 2.4 and lower
698 class Infinity(object):
699 def __cmp__(self, other):
700 return 1
701 def __sub__(self, other):
702 return self
703 inf = Infinity()
704
705
706 comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept-Encoding',
707 'Accept-Language', 'Accept-Ranges', 'Allow', 'Cache-Control', 'Connection',
708 'Content-Encoding', 'Content-Language', 'Expect', 'If-Match',
709 'If-None-Match', 'Pragma', 'Proxy-Authenticate', 'Te', 'Trailer',
710 'Transfer-Encoding', 'Upgrade', 'Vary', 'Via', 'Warning', 'Www-Authenticate']
711
712
713 class SizedReader:
714
715 def __init__(self, fp, length, maxbytes, bufsize=DEFAULT_BUFFER_SIZE, has_trailers=False):
716 # Wrap our fp in a buffer so peek() works
717 self.fp = fp
718 self.length = length
719 self.maxbytes = maxbytes
720 self.buffer = ntob('')
721 self.bufsize = bufsize
722 self.bytes_read = 0
723 self.done = False
724 self.has_trailers = has_trailers
725
726 def read(self, size=None, fp_out=None):
727 """Read bytes from the request body and return or write them to a file.
728
729 A number of bytes less than or equal to the 'size' argument are read
730 off the socket. The actual number of bytes read are tracked in
731 self.bytes_read. The number may be smaller than 'size' when 1) the
732 client sends fewer bytes, 2) the 'Content-Length' request header
733 specifies fewer bytes than requested, or 3) the number of bytes read
734 exceeds self.maxbytes (in which case, 413 is raised).
735
736 If the 'fp_out' argument is None (the default), all bytes read are
737 returned in a single byte string.
738
739 If the 'fp_out' argument is not None, it must be a file-like object that
740 supports the 'write' method; all bytes read will be written to the fp,
741 and None is returned.
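
A sketch of how the number of bytes to request follows from the rules
above. ``bytes_to_read`` is a hypothetical helper that mirrors the logic
at the start of this method; it is not a separate API.

```python
def bytes_to_read(length, bytes_read, size=None):
    # Hypothetical helper mirroring the size logic documented above.
    # length: declared Content-Length, or None when unknown.
    # bytes_read: how many body bytes have been consumed so far.
    # size: the caller's requested maximum, or None for "everything".
    if length is None:
        return float('inf') if size is None else size
    remaining = length - bytes_read
    if size and size < remaining:
        remaining = size
    return remaining


rest_of_body = bytes_to_read(100, 40)
capped_by_size = bytes_to_read(100, 40, 10)
capped_by_length = bytes_to_read(100, 40, 500)
unknown_length = bytes_to_read(None, 0, 16)
```

When a length is declared, the caller can never be handed more than the
bytes still owed by the client, whatever 'size' it asks for.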
742 """
743
744 if self.length is None:
745 if size is None:
746 remaining = inf
747 else:
748 remaining = size
749 else:
750 remaining = self.length - self.bytes_read
751 if size and size < remaining:
752 remaining = size
753 if remaining == 0:
754 self.finish()
755 if fp_out is None:
756 return ntob('')
757 else:
758 return None
759
760 chunks = []
761
762 # Read bytes from the buffer.
763 if self.buffer:
764 if remaining is inf:
765 data = self.buffer
766 self.buffer = ntob('')
767 else:
768 data = self.buffer[:remaining]
769 self.buffer = self.buffer[remaining:]
770 datalen = len(data)
771 remaining -= datalen
772
773 # Check lengths.
774 self.bytes_read += datalen
775 if self.maxbytes and self.bytes_read > self.maxbytes:
776 raise cherrypy.HTTPError(413)
777
778 # Store the data.
779 if fp_out is None:
780 chunks.append(data)
781 else:
782 fp_out.write(data)
783
784 # Read bytes from the socket.
785 while remaining > 0:
786 chunksize = min(remaining, self.bufsize)
787 try:
788 data = self.fp.read(chunksize)
789 except Exception:
790 e = sys.exc_info()[1]
791 if e.__class__.__name__ == 'MaxSizeExceeded':
792 # Post data is too big
793 raise cherrypy.HTTPError(
794 413, "Maximum request length: %r" % e.args[1])
795 else:
796 raise
797 if not data:
798 self.finish()
799 break
800 datalen = len(data)
801 remaining -= datalen
802
803 # Check lengths.
804 self.bytes_read += datalen
805 if self.maxbytes and self.bytes_read > self.maxbytes:
806 raise cherrypy.HTTPError(413)
807
808 # Store the data.
809 if fp_out is None:
810 chunks.append(data)
811 else:
812 fp_out.write(data)
813
814 if fp_out is None:
815 return ntob('').join(chunks)
816
817 def readline(self, size=None):
818 """Read a line from the request body and return it."""
819 chunks = []
820 while size is None or size > 0:
821 chunksize = self.bufsize
822 if size is not None and size < self.bufsize:
823 chunksize = size
824 data = self.read(chunksize)
825 if not data:
826 break
827 pos = data.find(ntob('\n')) + 1
828 if pos:
829 chunks.append(data[:pos])
830 remainder = data[pos:]
831 self.buffer += remainder
832 self.bytes_read -= len(remainder)
833 break
834 else:
835 chunks.append(data)
836 return ntob('').join(chunks)
837
838 def readlines(self, sizehint=None):
839 """Read lines from the request body and return them."""
840 if self.length is not None:
841 if sizehint is None:
842 sizehint = self.length - self.bytes_read
843 else:
844 sizehint = min(sizehint, self.length - self.bytes_read)
845
846 lines = []
847 seen = 0
848 while True:
849 line = self.readline()
850 if not line:
851 break
852 lines.append(line)
853 seen += len(line)
854 if seen >= sizehint:
855 break
856 return lines
857
858 def finish(self):
859 self.done = True
860 if self.has_trailers and hasattr(self.fp, 'read_trailer_lines'):
861 self.trailers = {}
862
863 try:
864 for line in self.fp.read_trailer_lines():
865 if line[0] in ntob(' \t'):
866 # It's a continuation line.
867 v = line.strip()
868 else:
869 try:
870 k, v = line.split(ntob(":"), 1)
871 except ValueError:
872 raise ValueError("Illegal header line.")
873 k = k.strip().title()
874 v = v.strip()
875
876 if k in comma_separated_headers:
877 existing = self.trailers.get(k)
878 if existing:
879 v = ntob(", ").join((existing, v))
880 self.trailers[k] = v
881 except Exception:
882 e = sys.exc_info()[1]
883 if e.__class__.__name__ == 'MaxSizeExceeded':
884 # Post data is too big
885 raise cherrypy.HTTPError(
886 413, "Maximum request length: %r" % e.args[1])
887 else:
888 raise
889
890
891 class RequestBody(Entity):
892 """The entity of the HTTP request."""
893
894 bufsize = 8 * 1024
895 """The buffer size used when reading the socket."""
896
897 # Don't parse the request body at all if the client didn't provide
898 # a Content-Type header. See http://www.cherrypy.org/ticket/790
899 default_content_type = ''
900 """This defines a default ``Content-Type`` to use if no Content-Type header
901 is given. The empty string is used for RequestBody, which results in the
902 request body not being read or parsed at all. This is by design; a missing
903 ``Content-Type`` header in the HTTP request entity is an error at best,
904 and a security hole at worst. For multipart parts, however, the MIME spec
905 declares that a part with no Content-Type defaults to "text/plain"
906 (see :class:`Part<cherrypy._cpreqbody.Part>`).
907 """
908
909 maxbytes = None
910 """Raise ``MaxSizeExceeded`` if more bytes than this are read from the socket."""
911
912 def __init__(self, fp, headers, params=None, request_params=None):
913 Entity.__init__(self, fp, headers, params)
914
915 # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1
916 # When no explicit charset parameter is provided by the
917 # sender, media subtypes of the "text" type are defined
918 # to have a default charset value of "ISO-8859-1" when
919 # received via HTTP.
920 if self.content_type.value.startswith('text/'):
921 for c in ('ISO-8859-1', 'iso-8859-1', 'Latin-1', 'latin-1'):
922 if c in self.attempt_charsets:
923 break
924 else:
925 self.attempt_charsets.append('ISO-8859-1')
926
927 # Temporary fix while deprecating passing .parts as .params.
928 self.processors['multipart'] = _old_process_multipart
929
930 if request_params is None:
931 request_params = {}
932 self.request_params = request_params
933
934 def process(self):
935 """Process the request entity based on its Content-Type."""
936 # "The presence of a message-body in a request is signaled by the
937 # inclusion of a Content-Length or Transfer-Encoding header field in
938 # the request's message-headers."
939 # It is possible to send a POST request with no body, for example;
940 # however, app developers are responsible in that case to set
941 # cherrypy.request.process_request_body to False so this method isn't called.
942 h = cherrypy.serving.request.headers
943 if 'Content-Length' not in h and 'Transfer-Encoding' not in h:
944 raise cherrypy.HTTPError(411)
945
946 self.fp = SizedReader(self.fp, self.length,
947 self.maxbytes, bufsize=self.bufsize,
948 has_trailers='Trailer' in h)
949 super(RequestBody, self).process()
950
951 # Body params should also be a part of the request_params
952 # add them in here.
953 request_params = self.request_params
954 for key, value in self.params.items():
955 # Python 2 only: keyword arguments must be byte strings (type 'str').
956 if sys.version_info < (3, 0):
957 if isinstance(key, unicode):
958 key = key.encode('ISO-8859-1')
959
960 if key in request_params:
961 if not isinstance(request_params[key], list):
962 request_params[key] = [request_params[key]]
963 request_params[key].append(value)
964 else:
965 request_params[key] = value