Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(51)

Side by Side Diff: src/x64/regexp-macro-assembler-x64.cc

Issue 10386090: Implement loop for global regexps in regexp assembler. (Closed) Base URL: https://v8.googlecode.com/svn/branches/bleeding_edge
Patch Set: fix bugs, add tests, port to x64 and arm. Created 8 years, 7 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch | Annotate | Revision Log
OLDNEW
1 // Copyright 2011 the V8 project authors. All rights reserved. 1 // Copyright 2012 the V8 project authors. All rights reserved.
2 // Redistribution and use in source and binary forms, with or without 2 // Redistribution and use in source and binary forms, with or without
3 // modification, are permitted provided that the following conditions are 3 // modification, are permitted provided that the following conditions are
4 // met: 4 // met:
5 // 5 //
6 // * Redistributions of source code must retain the above copyright 6 // * Redistributions of source code must retain the above copyright
7 // notice, this list of conditions and the following disclaimer. 7 // notice, this list of conditions and the following disclaimer.
8 // * Redistributions in binary form must reproduce the above 8 // * Redistributions in binary form must reproduce the above
9 // copyright notice, this list of conditions and the following 9 // copyright notice, this list of conditions and the following
10 // disclaimer in the documentation and/or other materials provided 10 // disclaimer in the documentation and/or other materials provided
11 // with the distribution. 11 // with the distribution.
(...skipping 58 matching lines...) Expand 10 before | Expand all | Expand 10 after
70 * 70 *
71 * Each call to a C++ method should retain these registers. 71 * Each call to a C++ method should retain these registers.
72 * 72 *
73 * The stack will have the following content, in some order, indexable from the 73 * The stack will have the following content, in some order, indexable from the
74 * frame pointer (see, e.g., kStackHighEnd): 74 * frame pointer (see, e.g., kStackHighEnd):
75 * - Isolate* isolate (Address of the current isolate) 75 * - Isolate* isolate (Address of the current isolate)
76 * - direct_call (if 1, direct call from JavaScript code, if 0 call 76 * - direct_call (if 1, direct call from JavaScript code, if 0 call
77 * through the runtime system) 77 * through the runtime system)
78 * - stack_area_base (High end of the memory area to use as 78 * - stack_area_base (High end of the memory area to use as
79 * backtracking stack) 79 * backtracking stack)
80 * - capture array size (may fit multiple sets of matches)
80 * - int* capture_array (int[num_saved_registers_], for output). 81 * - int* capture_array (int[num_saved_registers_], for output).
81 * - end of input (Address of end of string) 82 * - end of input (Address of end of string)
82 * - start of input (Address of first character in string) 83 * - start of input (Address of first character in string)
83 * - start index (character index of start) 84 * - start index (character index of start)
84 * - String* input_string (input string) 85 * - String* input_string (input string)
85 * - return address 86 * - return address
86 * - backup of callee save registers (rbx, possibly rsi and rdi). 87 * - backup of callee save registers (rbx, possibly rsi and rdi).
88 * - success counter (only useful for global regexp to count matches)
87 * - Offset of location before start of input (effectively character 89 * - Offset of location before start of input (effectively character
88 * position -1). Used to initialize capture registers to a non-position. 90 * position -1). Used to initialize capture registers to a non-position.
89 * - At start of string (if 1, we are starting at the start of the 91 * - At start of string (if 1, we are starting at the start of the
90 * string, otherwise 0) 92 * string, otherwise 0)
91 * - register 0 rbp[-n] (Only positions must be stored in the first 93 * - register 0 rbp[-n] (Only positions must be stored in the first
92 * - register 1 rbp[-n-8] num_saved_registers_ registers) 94 * - register 1 rbp[-n-8] num_saved_registers_ registers)
93 * - ... 95 * - ...
94 * 96 *
95 * The first num_saved_registers_ registers are initialized to point to 97 * The first num_saved_registers_ registers are initialized to point to
96 * "character -1" in the string (i.e., char_size() bytes before the first 98 * "character -1" in the string (i.e., char_size() bytes before the first
(...skipping 640 matching lines...) Expand 10 before | Expand all | Expand 10 after
737 // Match any character. 739 // Match any character.
738 return true; 740 return true;
739 // No custom implementation (yet): s(UC16), S(UC16). 741 // No custom implementation (yet): s(UC16), S(UC16).
740 default: 742 default:
741 return false; 743 return false;
742 } 744 }
743 } 745 }
744 746
745 747
746 void RegExpMacroAssemblerX64::Fail() { 748 void RegExpMacroAssemblerX64::Fail() {
747 ASSERT(FAILURE == 0); // Return value for failure is zero. 749 STATIC_ASSERT(FAILURE == 0); // Return value for failure is zero.
748 __ Set(rax, 0); 750 if (!global()) {
751 __ Set(rax, FAILURE);
752 }
749 __ jmp(&exit_label_); 753 __ jmp(&exit_label_);
750 } 754 }
751 755
752 756
753 Handle<HeapObject> RegExpMacroAssemblerX64::GetCode(Handle<String> source) { 757 Handle<HeapObject> RegExpMacroAssemblerX64::GetCode(Handle<String> source) {
758 Label return_rax, restart;
754 // Finalize code - write the entry point code now we know how many 759 // Finalize code - write the entry point code now we know how many
755 // registers we need. 760 // registers we need.
756 // Entry code: 761 // Entry code:
757 __ bind(&entry_label_); 762 __ bind(&entry_label_);
758 763
759 // Tell the system that we have a stack frame. Because the type is MANUAL, no 764 // Tell the system that we have a stack frame. Because the type is MANUAL, no
760 // is generated. 765 // is generated.
761 FrameScope scope(&masm_, StackFrame::MANUAL); 766 FrameScope scope(&masm_, StackFrame::MANUAL);
762 767
763 // Actually emit code to start a new stack frame. 768 // Actually emit code to start a new stack frame.
(...skipping 13 matching lines...) Expand all
777 __ push(rdi); 782 __ push(rdi);
778 __ push(rbx); 783 __ push(rbx);
779 #else 784 #else
780 // GCC passes arguments in rdi, rsi, rdx, rcx, r8, r9 (and then on stack). 785 // GCC passes arguments in rdi, rsi, rdx, rcx, r8, r9 (and then on stack).
781 // Push register parameters on stack for reference. 786 // Push register parameters on stack for reference.
782 ASSERT_EQ(kInputString, -1 * kPointerSize); 787 ASSERT_EQ(kInputString, -1 * kPointerSize);
783 ASSERT_EQ(kStartIndex, -2 * kPointerSize); 788 ASSERT_EQ(kStartIndex, -2 * kPointerSize);
784 ASSERT_EQ(kInputStart, -3 * kPointerSize); 789 ASSERT_EQ(kInputStart, -3 * kPointerSize);
785 ASSERT_EQ(kInputEnd, -4 * kPointerSize); 790 ASSERT_EQ(kInputEnd, -4 * kPointerSize);
786 ASSERT_EQ(kRegisterOutput, -5 * kPointerSize); 791 ASSERT_EQ(kRegisterOutput, -5 * kPointerSize);
787 ASSERT_EQ(kStackHighEnd, -6 * kPointerSize); 792 ASSERT_EQ(kNumOutputRegisters, -6 * kPointerSize);
788 __ push(rdi); 793 __ push(rdi);
789 __ push(rsi); 794 __ push(rsi);
790 __ push(rdx); 795 __ push(rdx);
791 __ push(rcx); 796 __ push(rcx);
792 __ push(r8); 797 __ push(r8);
793 __ push(r9); 798 __ push(r9);
794 799
795 __ push(rbx); // Callee-save 800 __ push(rbx); // Callee-save
796 #endif 801 #endif
797 802
803 __ push(Immediate(0)); // Number of successful matches in a global regexp.
798 __ push(Immediate(0)); // Make room for "at start" constant. 804 __ push(Immediate(0)); // Make room for "at start" constant.
799 805
800 // Check if we have space on the stack for registers. 806 // Check if we have space on the stack for registers.
801 Label stack_limit_hit; 807 Label stack_limit_hit;
802 Label stack_ok; 808 Label stack_ok;
803 809
804 ExternalReference stack_limit = 810 ExternalReference stack_limit =
805 ExternalReference::address_of_stack_limit(masm_.isolate()); 811 ExternalReference::address_of_stack_limit(masm_.isolate());
806 __ movq(rcx, rsp); 812 __ movq(rcx, rsp);
807 __ movq(kScratchRegister, stack_limit); 813 __ movq(kScratchRegister, stack_limit);
808 __ subq(rcx, Operand(kScratchRegister, 0)); 814 __ subq(rcx, Operand(kScratchRegister, 0));
809 // Handle it if the stack pointer is already below the stack limit. 815 // Handle it if the stack pointer is already below the stack limit.
810 __ j(below_equal, &stack_limit_hit); 816 __ j(below_equal, &stack_limit_hit);
811 // Check if there is room for the variable number of registers above 817 // Check if there is room for the variable number of registers above
812 // the stack limit. 818 // the stack limit.
813 __ cmpq(rcx, Immediate(num_registers_ * kPointerSize)); 819 __ cmpq(rcx, Immediate(num_registers_ * kPointerSize));
814 __ j(above_equal, &stack_ok); 820 __ j(above_equal, &stack_ok);
815 // Exit with OutOfMemory exception. There is not enough space on the stack 821 // Exit with OutOfMemory exception. There is not enough space on the stack
816 // for our working registers. 822 // for our working registers.
817 __ Set(rax, EXCEPTION); 823 __ Set(rax, EXCEPTION);
818 __ jmp(&exit_label_); 824 __ jmp(&return_rax);
819 825
820 __ bind(&stack_limit_hit); 826 __ bind(&stack_limit_hit);
821 __ Move(code_object_pointer(), masm_.CodeObject()); 827 __ Move(code_object_pointer(), masm_.CodeObject());
822 CallCheckStackGuardState(); // Preserves no registers beside rbp and rsp. 828 CallCheckStackGuardState(); // Preserves no registers beside rbp and rsp.
823 __ testq(rax, rax); 829 __ testq(rax, rax);
824 // If returned value is non-zero, we exit with the returned value as result. 830 // If returned value is non-zero, we exit with the returned value as result.
825 __ j(not_zero, &exit_label_); 831 __ j(not_zero, &return_rax);
826 832
827 __ bind(&stack_ok); 833 __ bind(&stack_ok);
828 834
829 // Allocate space on stack for registers. 835 // Allocate space on stack for registers.
830 __ subq(rsp, Immediate(num_registers_ * kPointerSize)); 836 __ subq(rsp, Immediate(num_registers_ * kPointerSize));
831 // Load string length. 837 // Load string length.
832 __ movq(rsi, Operand(rbp, kInputEnd)); 838 __ movq(rsi, Operand(rbp, kInputEnd));
833 // Load input position. 839 // Load input position.
834 __ movq(rdi, Operand(rbp, kInputStart)); 840 __ movq(rdi, Operand(rbp, kInputStart));
835 // Set up rdi to be negative offset from string end. 841 // Set up rdi to be negative offset from string end.
836 __ subq(rdi, rsi); 842 __ subq(rdi, rsi);
837 // Set rax to address of char before start of the string 843 // Set rax to address of char before start of the string
838 // (effectively string position -1). 844 // (effectively string position -1).
839 __ movq(rbx, Operand(rbp, kStartIndex)); 845 __ movq(rbx, Operand(rbp, kStartIndex));
840 __ neg(rbx); 846 __ neg(rbx);
841 if (mode_ == UC16) { 847 if (mode_ == UC16) {
842 __ lea(rax, Operand(rdi, rbx, times_2, -char_size())); 848 __ lea(rax, Operand(rdi, rbx, times_2, -char_size()));
843 } else { 849 } else {
844 __ lea(rax, Operand(rdi, rbx, times_1, -char_size())); 850 __ lea(rax, Operand(rdi, rbx, times_1, -char_size()));
845 } 851 }
846 // Store this value in a local variable, for use when clearing 852 // Store this value in a local variable, for use when clearing
847 // position registers. 853 // position registers.
848 __ movq(Operand(rbp, kInputStartMinusOne), rax); 854 __ movq(Operand(rbp, kInputStartMinusOne), rax);
849 855
856 // Initialize code object pointer.
857 __ Move(code_object_pointer(), masm_.CodeObject());
858
859 // Load previous char as initial value of current-character.
860 Label at_start, character_loaded;
861 __ cmpb(Operand(rbp, kStartIndex), Immediate(0));
862 __ j(equal, &at_start, Label::kNear);
863 LoadCurrentCharacterUnchecked(-1, 1); // Load previous char.
864 __ jmp(&character_loaded, Label::kNear);
865 __ bind(&at_start);
866 __ Set(current_character(), '\n');
867
868 if (global()) {
869 // Initializing current-character already done, so skip it.
870 __ jmp(&character_loaded, Label::kNear);
871 // Restart matching from here in a global regexp.
872 __ bind(&restart);
873 // In a restarted pass, initialize current-character here.
874 LoadCurrentCharacterUnchecked(-1, 1);
875 }
876 __ bind(&character_loaded);
877
878 // Initialize on-stack registers.
850 if (num_saved_registers_ > 0) { 879 if (num_saved_registers_ > 0) {
851 // Fill saved registers with initial value = start offset - 1 880 // Fill saved registers with initial value = start offset - 1
852 // Fill in stack push order, to avoid accessing across an unwritten 881 // Fill in stack push order, to avoid accessing across an unwritten
853 // page (a problem on Windows). 882 // page (a problem on Windows).
854 __ Set(rcx, kRegisterZero); 883 if (num_saved_registers_ > 8) {
855 Label init_loop; 884 __ Set(rcx, kRegisterZero);
856 __ bind(&init_loop); 885 Label init_loop;
857 __ movq(Operand(rbp, rcx, times_1, 0), rax); 886 __ bind(&init_loop);
858 __ subq(rcx, Immediate(kPointerSize)); 887 __ movq(Operand(rbp, rcx, times_1, 0), rax);
859 __ cmpq(rcx, 888 __ subq(rcx, Immediate(kPointerSize));
860 Immediate(kRegisterZero - num_saved_registers_ * kPointerSize)); 889 __ cmpq(rcx,
861 __ j(greater, &init_loop); 890 Immediate(kRegisterZero - num_saved_registers_ * kPointerSize));
862 } 891 __ j(greater, &init_loop);
863 // Ensure that we have written to each stack page, in order. Skipping a page 892 } else { // Unroll the loop.
864 // on Windows can cause segmentation faults. Assuming page size is 4k. 893 for (int i = 0; i < num_saved_registers_; i++) {
865 const int kPageSize = 4096; 894 __ movq(register_location(i), rax);
866 const int kRegistersPerPage = kPageSize / kPointerSize; 895 }
867 for (int i = num_saved_registers_ + kRegistersPerPage - 1; 896 }
868 i < num_registers_;
869 i += kRegistersPerPage) {
870 __ movq(register_location(i), rax); // One write every page.
871 } 897 }
872 898
873 // Initialize backtrack stack pointer. 899 // Initialize backtrack stack pointer.
874 __ movq(backtrack_stackpointer(), Operand(rbp, kStackHighEnd)); 900 __ movq(backtrack_stackpointer(), Operand(rbp, kStackHighEnd));
875 // Initialize code object pointer. 901
876 __ Move(code_object_pointer(), masm_.CodeObject());
877 // Load previous char as initial value of current-character.
878 Label at_start;
879 __ cmpb(Operand(rbp, kStartIndex), Immediate(0));
880 __ j(equal, &at_start);
881 LoadCurrentCharacterUnchecked(-1, 1); // Load previous char.
882 __ jmp(&start_label_); 902 __ jmp(&start_label_);
883 __ bind(&at_start);
884 __ Set(current_character(), '\n');
885 __ jmp(&start_label_);
886
887 903
888 // Exit code: 904 // Exit code:
889 if (success_label_.is_linked()) { 905 if (success_label_.is_linked()) {
890 // Save captures when successful. 906 // Save captures when successful.
891 __ bind(&success_label_); 907 __ bind(&success_label_);
892 if (num_saved_registers_ > 0) { 908 if (num_saved_registers_ > 0) {
893 // copy captures to output 909 // copy captures to output
894 __ movq(rdx, Operand(rbp, kStartIndex)); 910 __ movq(rdx, Operand(rbp, kStartIndex));
895 __ movq(rbx, Operand(rbp, kRegisterOutput)); 911 __ movq(rbx, Operand(rbp, kRegisterOutput));
896 __ movq(rcx, Operand(rbp, kInputEnd)); 912 __ movq(rcx, Operand(rbp, kInputEnd));
897 __ subq(rcx, Operand(rbp, kInputStart)); 913 __ subq(rcx, Operand(rbp, kInputStart));
898 if (mode_ == UC16) { 914 if (mode_ == UC16) {
899 __ lea(rcx, Operand(rcx, rdx, times_2, 0)); 915 __ lea(rcx, Operand(rcx, rdx, times_2, 0));
900 } else { 916 } else {
901 __ addq(rcx, rdx); 917 __ addq(rcx, rdx);
902 } 918 }
903 for (int i = 0; i < num_saved_registers_; i++) { 919 for (int i = 0; i < num_saved_registers_; i++) {
904 __ movq(rax, register_location(i)); 920 __ movq(rax, register_location(i));
921 if (i == 0 && global()) {
922 // Keep capture start in rdx for the zero-length check later.
923 __ movq(rdx, rax);
924 }
905 __ addq(rax, rcx); // Convert to index from start, not end. 925 __ addq(rax, rcx); // Convert to index from start, not end.
906 if (mode_ == UC16) { 926 if (mode_ == UC16) {
907 __ sar(rax, Immediate(1)); // Convert byte index to character index. 927 __ sar(rax, Immediate(1)); // Convert byte index to character index.
908 } 928 }
909 __ movl(Operand(rbx, i * kIntSize), rax); 929 __ movl(Operand(rbx, i * kIntSize), rax);
910 } 930 }
911 } 931 }
912 __ Set(rax, SUCCESS); 932
933 if (global()) {
934 // Restart matching if the regular expression is flagged as global.
935 // Increment success counter.
936 __ incq(Operand(rbp, kSuccessfulCaptures));
937 // Capture results have been stored, so the number of remaining global
938 // output registers is reduced by the number of stored captures.
939 __ movq(rcx, Operand(rbp, kNumOutputRegisters));
940 __ subq(rcx, Immediate(num_saved_registers_));
941 // Check whether we have enough room for another set of capture results.
942 __ cmpq(rcx, Immediate(num_saved_registers_));
943 __ j(less, &exit_label_);
944
945 __ movq(Operand(rbp, kNumOutputRegisters), rcx);
946 // Advance the location for output.
947 __ addq(Operand(rbp, kRegisterOutput),
948 Immediate(num_saved_registers_ * kIntSize));
949
950 // Prepare rax to initialize registers with its value in the next run.
951 __ movq(rax, Operand(rbp, kInputStartMinusOne));
952
953 // Special case for zero-length matches.
954 // rdx: capture start index
955 __ cmpq(rdi, rdx);
956 // Not a zero-length match, restart.
957 __ j(not_equal, &restart);
958 // rdi (offset from the end) is zero if we already reached the end.
959 __ testq(rdi, rdi);
960 __ j(zero, &exit_label_, Label::kNear);
961 // Advance current position after a zero-length match.
962 if (mode_ == UC16) {
963 __ addq(rdi, Immediate(2));
964 } else {
965 __ incq(rdi);
966 }
967 __ jmp(&restart);
968 } else {
969 __ movq(rax, Immediate(SUCCESS));
970 }
913 } 971 }
914 972
915 // Exit and return rax
916 __ bind(&exit_label_); 973 __ bind(&exit_label_);
974 if (global()) {
975 // Return the number of successful captures.
976 __ movq(rax, Operand(rbp, kSuccessfulCaptures));
977 }
917 978
979 __ bind(&return_rax);
918 #ifdef _WIN64 980 #ifdef _WIN64
919 // Restore callee save registers. 981 // Restore callee save registers.
920 __ lea(rsp, Operand(rbp, kLastCalleeSaveRegister)); 982 __ lea(rsp, Operand(rbp, kLastCalleeSaveRegister));
921 __ pop(rbx); 983 __ pop(rbx);
922 __ pop(rdi); 984 __ pop(rdi);
923 __ pop(rsi); 985 __ pop(rsi);
924 // Stack now at rbp. 986 // Stack now at rbp.
925 #else 987 #else
926 // Restore callee save register. 988 // Restore callee save register.
927 __ movq(rbx, Operand(rbp, kBackup_rbx)); 989 __ movq(rbx, Operand(rbp, kBackup_rbx));
(...skipping 16 matching lines...) Expand all
944 if (check_preempt_label_.is_linked()) { 1006 if (check_preempt_label_.is_linked()) {
945 SafeCallTarget(&check_preempt_label_); 1007 SafeCallTarget(&check_preempt_label_);
946 1008
947 __ push(backtrack_stackpointer()); 1009 __ push(backtrack_stackpointer());
948 __ push(rdi); 1010 __ push(rdi);
949 1011
950 CallCheckStackGuardState(); 1012 CallCheckStackGuardState();
951 __ testq(rax, rax); 1013 __ testq(rax, rax);
952 // If returning non-zero, we should end execution with the given 1014 // If returning non-zero, we should end execution with the given
953 // result as return value. 1015 // result as return value.
954 __ j(not_zero, &exit_label_); 1016 __ j(not_zero, &return_rax);
955 1017
956 // Restore registers. 1018 // Restore registers.
957 __ Move(code_object_pointer(), masm_.CodeObject()); 1019 __ Move(code_object_pointer(), masm_.CodeObject());
958 __ pop(rdi); 1020 __ pop(rdi);
959 __ pop(backtrack_stackpointer()); 1021 __ pop(backtrack_stackpointer());
960 // String might have moved: Reload esi from frame. 1022 // String might have moved: Reload esi from frame.
961 __ movq(rsi, Operand(rbp, kInputEnd)); 1023 __ movq(rsi, Operand(rbp, kInputEnd));
962 SafeReturn(); 1024 SafeReturn();
963 } 1025 }
964 1026
(...skipping 40 matching lines...) Expand 10 before | Expand all | Expand 10 after
1005 __ pop(rsi); 1067 __ pop(rsi);
1006 #endif 1068 #endif
1007 SafeReturn(); 1069 SafeReturn();
1008 } 1070 }
1009 1071
1010 if (exit_with_exception.is_linked()) { 1072 if (exit_with_exception.is_linked()) {
1011 // If any of the code above needed to exit with an exception. 1073 // If any of the code above needed to exit with an exception.
1012 __ bind(&exit_with_exception); 1074 __ bind(&exit_with_exception);
1013 // Exit with Result EXCEPTION(-1) to signal thrown exception. 1075 // Exit with Result EXCEPTION(-1) to signal thrown exception.
1014 __ Set(rax, EXCEPTION); 1076 __ Set(rax, EXCEPTION);
1015 __ jmp(&exit_label_); 1077 __ jmp(&return_rax);
1016 } 1078 }
1017 1079
1018 FixupCodeRelativePositions(); 1080 FixupCodeRelativePositions();
1019 1081
1020 CodeDesc code_desc; 1082 CodeDesc code_desc;
1021 masm_.GetCode(&code_desc); 1083 masm_.GetCode(&code_desc);
1022 Isolate* isolate = ISOLATE; 1084 Isolate* isolate = ISOLATE;
1023 Handle<Code> code = isolate->factory()->NewCode( 1085 Handle<Code> code = isolate->factory()->NewCode(
1024 code_desc, Code::ComputeFlags(Code::REGEXP), 1086 code_desc, Code::ComputeFlags(Code::REGEXP),
1025 masm_.CodeObject()); 1087 masm_.CodeObject());
(...skipping 434 matching lines...) Expand 10 before | Expand all | Expand 10 after
1460 } 1522 }
1461 } 1523 }
1462 1524
1463 #undef __ 1525 #undef __
1464 1526
1465 #endif // V8_INTERPRETED_REGEXP 1527 #endif // V8_INTERPRETED_REGEXP
1466 1528
1467 }} // namespace v8::internal 1529 }} // namespace v8::internal
1468 1530
1469 #endif // V8_TARGET_ARCH_X64 1531 #endif // V8_TARGET_ARCH_X64
OLDNEW

Powered by Google App Engine
This is Rietveld 408576698