Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(205)

Issue 17858002: ARM: Implement memcpy using NEON. (Closed)

Created:
7 years, 5 months ago by Rodolph Perfetta
Modified:
7 years, 5 months ago
CC:
v8-dev
Visibility:
Public.

Description

ARM: Implement memcpy using NEON. Add support for a few NEON and ARM SIMD instructions and use them for various memcpy operations. BUG=none TEST=none Committed: https://code.google.com/p/v8/source/detail?r=15602

Patch Set 1 #

Patch Set 2 : Addressed lint issues. #

Total comments: 7

Patch Set 3 : Remove "unaligned accesses" from C++ code #

Total comments: 10
Unified diffs Side-by-side diffs Delta from patch set Stats (+1651 lines, -62 lines) Patch
M src/arm/assembler-arm.h View 7 chunks +111 lines, -0 lines 0 comments Download
M src/arm/assembler-arm.cc View 1 2 9 chunks +250 lines, -2 lines 4 comments Download
M src/arm/codegen-arm.cc View 1 2 1 chunk +245 lines, -0 lines 4 comments Download
M src/arm/constants-arm.h View 3 chunks +27 lines, -16 lines 0 comments Download
M src/arm/disasm-arm.cc View 1 2 6 chunks +223 lines, -8 lines 0 comments Download
M src/arm/simulator-arm.h View 3 chunks +14 lines, -1 line 0 comments Download
M src/arm/simulator-arm.cc View 1 2 4 chunks +337 lines, -27 lines 0 comments Download
M src/flag-definitions.h View 1 2 1 chunk +2 lines, -0 lines 0 comments Download
M src/globals.h View 1 2 2 chunks +10 lines, -6 lines 0 comments Download
M src/platform.h View 1 2 2 chunks +39 lines, -1 line 0 comments Download
M src/platform-linux.cc View 1 2 2 chunks +33 lines, -0 lines 0 comments Download
M src/platform-nullos.cc View 1 chunk +5 lines, -0 lines 0 comments Download
M src/platform-posix.cc View 1 2 2 chunks +25 lines, -1 line 0 comments Download
M src/v8globals.h View 2 chunks +12 lines, -0 lines 0 comments Download
M src/v8utils.h View 1 2 2 chunks +104 lines, -0 lines 2 comments Download
M test/cctest/test-assembler-arm.cc View 1 chunk +181 lines, -0 lines 0 comments Download
M test/cctest/test-disasm-arm.cc View 3 chunks +33 lines, -0 lines 0 comments Download

Messages

Total messages: 6 (0 generated)
Rodolph Perfetta
7 years, 5 months ago (2013-06-26 11:19:30 UTC) #1
ulan
https://codereview.chromium.org/17858002/diff/14001/src/arm/codegen-arm.cc File src/arm/codegen-arm.cc (right): https://codereview.chromium.org/17858002/diff/14001/src/arm/codegen-arm.cc#newcode115 src/arm/codegen-arm.cc:115: #if defined(V8_HOST_ARCH_ARM) && defined (V8_HOST_CAN_READ_UNALIGNED) It looks like V8_HOST_CAN_READ_UNALIGNED ...
7 years, 5 months ago (2013-06-28 15:07:42 UTC) #2
Rodolph Perfetta
ARMv7 CPU can perform unaligned accesses for ldr, ldrh but not ldrd and ldm. There ...
7 years, 5 months ago (2013-07-03 16:19:28 UTC) #3
Rodolph Perfetta
https://codereview.chromium.org/17858002/diff/14001/src/arm/disasm-arm.cc File src/arm/disasm-arm.cc (right): https://codereview.chromium.org/17858002/diff/14001/src/arm/disasm-arm.cc#newcode446 src/arm/disasm-arm.cc:446: ":%d", (1<<align)<<6); On 2013/06/28 15:07:43, ulan wrote: > Spaces ...
7 years, 5 months ago (2013-07-03 16:59:38 UTC) #4
ulan
LGTM with nits. https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/assembler-arm.cc File src/arm/assembler-arm.cc (right): https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/assembler-arm.cc#newcode398 src/arm/assembler-arm.cc:398: switch (align) { Extracting this switch ...
7 years, 5 months ago (2013-07-09 15:16:32 UTC) #5
vincent.belliard.fr
7 years, 5 months ago (2013-07-10 15:30:37 UTC) #6
https://chromiumcodereview.appspot.com/17858002/diff/14001/src/arm/disasm-arm.cc
File src/arm/disasm-arm.cc (right):

https://chromiumcodereview.appspot.com/17858002/diff/14001/src/arm/disasm-arm...
src/arm/disasm-arm.cc:446: ":%d", (1<<align)<<6);
On 2013/06/28 15:07:43, ulan wrote:
> Spaces around "<<".

Done.

https://chromiumcodereview.appspot.com/17858002/diff/14001/src/platform.h
File src/platform.h (right):

https://chromiumcodereview.appspot.com/17858002/diff/14001/src/platform.h#new...
src/platform.h:380: const uint8_t* src,
On 2013/06/28 15:07:43, ulan wrote:
> Indentation is off.

Done.

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/assembler-...
File src/arm/assembler-arm.cc (right):

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/assembler-...
src/arm/assembler-arm.cc:398: switch (align) {
On 2013/07/09 15:16:32, ulan wrote:
> Extracting this switch into a function would avoid code duplication below.

Done.

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/assembler-...
src/arm/assembler-arm.cc:1832: void Assembler::pld(const MemOperand& address) {
On 2013/07/09 15:16:32, ulan wrote:
> Missing the description comment.

Done.

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/codegen-ar...
File src/arm/codegen-arm.cc (right):

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/codegen-ar...
src/arm/codegen-arm.cc:137: Label loop, less_256, less_128, less_64, less_32,
_16_or_less, _8_or_less;
With ARM, pld starts a load but doesn't wait for the result. If we only preload
64 bytes, it wouldn't speedup anything because we would do the load just after
the preload (in fact in would add a little overhead). We need a preload of 256
to have the best result.
However, loading too much data would slowdown. That why we try to preload
exactly what we need (we only preload up to 63 extra bytes).

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/arm/codegen-ar...
src/arm/codegen-arm.cc:173: __ pld(MemOperand(src, 256));
On 2013/07/09 15:16:32, ulan wrote:
> Shouldn't this be __ pld(MemOperand(src, 256 - 32)) since src advanced by 64?

Done.

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/v8utils.h
File src/v8utils.h (right):

https://chromiumcodereview.appspot.com/17858002/diff/25001/src/v8utils.h#newc...
src/v8utils.h:431: memcpy(dest, src, 15);
This gives good improvement. For small known sizes, gcc generates the good
instructions (ldrb, ldrh, ldr) to copy the needed size.

Powered by Google App Engine
This is Rietveld 408576698