Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(553)

Side by Side Diff: src/ia32/regexp-macro-assembler-ia32.cc

Issue 10386090: Implement loop for global regexps in regexp assembler. (Closed) Base URL: https://v8.googlecode.com/svn/branches/bleeding_edge
Patch Set: fix bugs, add tests, port to x64 and arm. Created 8 years, 7 months ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch | Annotate | Revision Log
OLDNEW
1 // Copyright 2011 the V8 project authors. All rights reserved. 1 // Copyright 2012 the V8 project authors. All rights reserved.
2 // Redistribution and use in source and binary forms, with or without 2 // Redistribution and use in source and binary forms, with or without
3 // modification, are permitted provided that the following conditions are 3 // modification, are permitted provided that the following conditions are
4 // met: 4 // met:
5 // 5 //
6 // * Redistributions of source code must retain the above copyright 6 // * Redistributions of source code must retain the above copyright
7 // notice, this list of conditions and the following disclaimer. 7 // notice, this list of conditions and the following disclaimer.
8 // * Redistributions in binary form must reproduce the above 8 // * Redistributions in binary form must reproduce the above
9 // copyright notice, this list of conditions and the following 9 // copyright notice, this list of conditions and the following
10 // disclaimer in the documentation and/or other materials provided 10 // disclaimer in the documentation and/or other materials provided
11 // with the distribution. 11 // with the distribution.
(...skipping 42 matching lines...) Expand 10 before | Expand all | Expand 10 after
54 * 54 *
55 * The registers eax and ebx are free to use for computations. 55 * The registers eax and ebx are free to use for computations.
56 * 56 *
57 * Each call to a public method should retain this convention. 57 * Each call to a public method should retain this convention.
58 * The stack will have the following structure: 58 * The stack will have the following structure:
59 * - Isolate* isolate (Address of the current isolate) 59 * - Isolate* isolate (Address of the current isolate)
60 * - direct_call (if 1, direct call from JavaScript code, if 0 60 * - direct_call (if 1, direct call from JavaScript code, if 0
61 * call through the runtime system) 61 * call through the runtime system)
62 * - stack_area_base (High end of the memory area to use as 62 * - stack_area_base (High end of the memory area to use as
63 * backtracking stack) 63 * backtracking stack)
64 * - capture array size (may fit multiple sets of matches)
64 * - int* capture_array (int[num_saved_registers_], for output). 65 * - int* capture_array (int[num_saved_registers_], for output).
65 * - end of input (Address of end of string) 66 * - end of input (Address of end of string)
66 * - start of input (Address of first character in string) 67 * - start of input (Address of first character in string)
67 * - start index (character index of start) 68 * - start index (character index of start)
68 * - String* input_string (location of a handle containing the string) 69 * - String* input_string (location of a handle containing the string)
69 * --- frame alignment (if applicable) --- 70 * --- frame alignment (if applicable) ---
70 * - return address 71 * - return address
71 * ebp-> - old ebp 72 * ebp-> - old ebp
72 * - backup of caller esi 73 * - backup of caller esi
73 * - backup of caller edi 74 * - backup of caller edi
74 * - backup of caller ebx 75 * - backup of caller ebx
76 * - success counter (only useful for global regexp to count matches)
75 * - Offset of location before start of input (effectively character 77 * - Offset of location before start of input (effectively character
76 * position -1). Used to initialize capture registers to a non-position. 78 * position -1). Used to initialize capture registers to a non-position.
77 * - register 0 ebp[-4] (Only positions must be stored in the first 79 * - register 0 ebp[-4] (Only positions must be stored in the first
78 * - register 1 ebp[-8] num_saved_registers_ registers) 80 * - register 1 ebp[-8] num_saved_registers_ registers)
79 * - ... 81 * - ...
80 * 82 *
81 * The first num_saved_registers_ registers are initialized to point to 83 * The first num_saved_registers_ registers are initialized to point to
82 * "character -1" in the string (i.e., char_size() bytes before the first 84 * "character -1" in the string (i.e., char_size() bytes before the first
83 * character of the string). The remaining registers starts out as garbage. 85 * character of the string). The remaining registers starts out as garbage.
84 * 86 *
(...skipping 614 matching lines...) Expand 10 before | Expand all | Expand 10 after
699 return true; 701 return true;
700 } 702 }
701 // No custom implementation (yet): s(UC16), S(UC16). 703 // No custom implementation (yet): s(UC16), S(UC16).
702 default: 704 default:
703 return false; 705 return false;
704 } 706 }
705 } 707 }
706 708
707 709
708 void RegExpMacroAssemblerIA32::Fail() { 710 void RegExpMacroAssemblerIA32::Fail() {
709 ASSERT(FAILURE == 0); // Return value for failure is zero. 711 STATIC_ASSERT(FAILURE == 0); // Return value for failure is zero.
710 __ Set(eax, Immediate(0)); 712 if (!global()) {
713 __ Set(eax, Immediate(FAILURE));
714 }
711 __ jmp(&exit_label_); 715 __ jmp(&exit_label_);
712 } 716 }
713 717
714 718
715 Handle<HeapObject> RegExpMacroAssemblerIA32::GetCode(Handle<String> source) { 719 Handle<HeapObject> RegExpMacroAssemblerIA32::GetCode(Handle<String> source) {
720 Label return_eax, restart;
716 // Finalize code - write the entry point code now we know how many 721 // Finalize code - write the entry point code now we know how many
717 // registers we need. 722 // registers we need.
718 723
719 // Entry code: 724 // Entry code:
720 __ bind(&entry_label_); 725 __ bind(&entry_label_);
721 726
722 // Tell the system that we have a stack frame. Because the type is MANUAL, no 727 // Tell the system that we have a stack frame. Because the type is MANUAL, no
723 // code is generated. 728 // code is generated.
724 FrameScope scope(masm_, StackFrame::MANUAL); 729 FrameScope scope(masm_, StackFrame::MANUAL);
725 730
726 // Actually emit code to start a new stack frame. 731 // Actually emit code to start a new stack frame.
727 __ push(ebp); 732 __ push(ebp);
728 __ mov(ebp, esp); 733 __ mov(ebp, esp);
729 // Save callee-save registers. Order here should correspond to order of 734 // Save callee-save registers. Order here should correspond to order of
730 // kBackup_ebx etc. 735 // kBackup_ebx etc.
731 __ push(esi); 736 __ push(esi);
732 __ push(edi); 737 __ push(edi);
733 __ push(ebx); // Callee-save on MacOS. 738 __ push(ebx); // Callee-save on MacOS.
739 __ push(Immediate(0)); // Number of successful matches in a global regexp.
734 __ push(Immediate(0)); // Make room for "input start - 1" constant. 740 __ push(Immediate(0)); // Make room for "input start - 1" constant.
735 741
736 // Check if we have space on the stack for registers. 742 // Check if we have space on the stack for registers.
737 Label stack_limit_hit; 743 Label stack_limit_hit;
738 Label stack_ok; 744 Label stack_ok;
739 745
740 ExternalReference stack_limit = 746 ExternalReference stack_limit =
741 ExternalReference::address_of_stack_limit(masm_->isolate()); 747 ExternalReference::address_of_stack_limit(masm_->isolate());
742 __ mov(ecx, esp); 748 __ mov(ecx, esp);
743 __ sub(ecx, Operand::StaticVariable(stack_limit)); 749 __ sub(ecx, Operand::StaticVariable(stack_limit));
744 // Handle it if the stack pointer is already below the stack limit. 750 // Handle it if the stack pointer is already below the stack limit.
745 __ j(below_equal, &stack_limit_hit); 751 __ j(below_equal, &stack_limit_hit);
746 // Check if there is room for the variable number of registers above 752 // Check if there is room for the variable number of registers above
747 // the stack limit. 753 // the stack limit.
748 __ cmp(ecx, num_registers_ * kPointerSize); 754 __ cmp(ecx, num_registers_ * kPointerSize);
749 __ j(above_equal, &stack_ok); 755 __ j(above_equal, &stack_ok);
750 // Exit with OutOfMemory exception. There is not enough space on the stack 756 // Exit with OutOfMemory exception. There is not enough space on the stack
751 // for our working registers. 757 // for our working registers.
752 __ mov(eax, EXCEPTION); 758 __ mov(eax, EXCEPTION);
753 __ jmp(&exit_label_); 759 __ jmp(&return_eax);
754 760
755 __ bind(&stack_limit_hit); 761 __ bind(&stack_limit_hit);
756 CallCheckStackGuardState(ebx); 762 CallCheckStackGuardState(ebx);
757 __ or_(eax, eax); 763 __ or_(eax, eax);
758 // If returned value is non-zero, we exit with the returned value as result. 764 // If returned value is non-zero, we exit with the returned value as result.
759 __ j(not_zero, &exit_label_); 765 __ j(not_zero, &return_eax);
760 766
761 __ bind(&stack_ok); 767 __ bind(&stack_ok);
762 // Load start index for later use. 768 // Load start index for later use.
763 __ mov(ebx, Operand(ebp, kStartIndex)); 769 __ mov(ebx, Operand(ebp, kStartIndex));
764 770
765 // Allocate space on stack for registers. 771 // Allocate space on stack for registers.
766 __ sub(esp, Immediate(num_registers_ * kPointerSize)); 772 __ sub(esp, Immediate(num_registers_ * kPointerSize));
767 // Load string length. 773 // Load string length.
768 __ mov(esi, Operand(ebp, kInputEnd)); 774 __ mov(esi, Operand(ebp, kInputEnd));
769 // Load input position. 775 // Load input position.
770 __ mov(edi, Operand(ebp, kInputStart)); 776 __ mov(edi, Operand(ebp, kInputStart));
771 // Set up edi to be negative offset from string end. 777 // Set up edi to be negative offset from string end.
772 __ sub(edi, esi); 778 __ sub(edi, esi);
773 779
774 // Set eax to address of char before start of the string. 780 // Set eax to address of char before start of the string.
775 // (effectively string position -1). 781 // (effectively string position -1).
776 __ neg(ebx); 782 __ neg(ebx);
777 if (mode_ == UC16) { 783 if (mode_ == UC16) {
778 __ lea(eax, Operand(edi, ebx, times_2, -char_size())); 784 __ lea(eax, Operand(edi, ebx, times_2, -char_size()));
779 } else { 785 } else {
780 __ lea(eax, Operand(edi, ebx, times_1, -char_size())); 786 __ lea(eax, Operand(edi, ebx, times_1, -char_size()));
781 } 787 }
782 // Store this value in a local variable, for use when clearing 788 // Store this value in a local variable, for use when clearing
783 // position registers. 789 // position registers.
784 __ mov(Operand(ebp, kInputStartMinusOne), eax); 790 __ mov(Operand(ebp, kInputStartMinusOne), eax);
785 791
792 // Load previous char as initial value of current-character.
793 Label at_start, character_loaded;
794 __ cmp(Operand(ebp, kStartIndex), Immediate(0));
795 __ j(equal, &at_start, Label::kNear);
796 LoadCurrentCharacterUnchecked(-1, 1); // Load previous char.
797 __ jmp(&character_loaded, Label::kNear);
798 __ bind(&at_start);
799 __ mov(current_character(), '\n');
800
801 if (global()) {
802 // Initializing current-character already done, so skip it.
803 __ jmp(&character_loaded, Label::kNear);
804 // Restart matching from here in a global regexp.
805 __ bind(&restart);
806 // In a restarted pass, initialize current-character here.
807 LoadCurrentCharacterUnchecked(-1, 1);
808 }
809 __ bind(&character_loaded);
810
811 // Initialize on-stack registers.
786 if (num_saved_registers_ > 0) { // Always is, if generated from a regexp. 812 if (num_saved_registers_ > 0) { // Always is, if generated from a regexp.
787 // Fill saved registers with initial value = start offset - 1 813 // Fill saved registers with initial value = start offset - 1
788 // Fill in stack push order, to avoid accessing across an unwritten 814 // Fill in stack push order, to avoid accessing across an unwritten
789 // page (a problem on Windows). 815 // page (a problem on Windows).
790 __ mov(ecx, kRegisterZero); 816 if (num_saved_registers_ > 8) {
791 Label init_loop; 817 __ mov(ecx, kRegisterZero);
792 __ bind(&init_loop); 818 Label init_loop;
793 __ mov(Operand(ebp, ecx, times_1, +0), eax); 819 __ bind(&init_loop);
794 __ sub(ecx, Immediate(kPointerSize)); 820 __ mov(Operand(ebp, ecx, times_1, 0), eax);
795 __ cmp(ecx, kRegisterZero - num_saved_registers_ * kPointerSize); 821 __ sub(ecx, Immediate(kPointerSize));
796 __ j(greater, &init_loop); 822 __ cmp(ecx, kRegisterZero - num_saved_registers_ * kPointerSize);
823 __ j(greater, &init_loop);
824 } else { // Unroll the loop.
825 for (int i = 0; i < num_saved_registers_; i++) {
826 __ mov(register_location(i), eax);
827 }
828 }
797 } 829 }
798 // Ensure that we have written to each stack page, in order. Skipping a page
799 // on Windows can cause segmentation faults. Assuming page size is 4k.
800 const int kPageSize = 4096;
801 const int kRegistersPerPage = kPageSize / kPointerSize;
802 for (int i = num_saved_registers_ + kRegistersPerPage - 1;
803 i < num_registers_;
804 i += kRegistersPerPage) {
805 __ mov(register_location(i), eax); // One write every page.
806 }
807
808 830
809 // Initialize backtrack stack pointer. 831 // Initialize backtrack stack pointer.
810 __ mov(backtrack_stackpointer(), Operand(ebp, kStackHighEnd)); 832 __ mov(backtrack_stackpointer(), Operand(ebp, kStackHighEnd));
811 // Load previous char as initial value of current-character. 833
812 Label at_start;
813 __ cmp(Operand(ebp, kStartIndex), Immediate(0));
814 __ j(equal, &at_start);
815 LoadCurrentCharacterUnchecked(-1, 1); // Load previous char.
816 __ jmp(&start_label_); 834 __ jmp(&start_label_);
817 __ bind(&at_start);
818 __ mov(current_character(), '\n');
819 __ jmp(&start_label_);
820
821 835
822 // Exit code: 836 // Exit code:
823 if (success_label_.is_linked()) { 837 if (success_label_.is_linked()) {
824 // Save captures when successful. 838 // Save captures when successful.
825 __ bind(&success_label_); 839 __ bind(&success_label_);
826 if (num_saved_registers_ > 0) { 840 if (num_saved_registers_ > 0) {
827 // copy captures to output 841 // copy captures to output
828 __ mov(ebx, Operand(ebp, kRegisterOutput)); 842 __ mov(ebx, Operand(ebp, kRegisterOutput));
829 __ mov(ecx, Operand(ebp, kInputEnd)); 843 __ mov(ecx, Operand(ebp, kInputEnd));
830 __ mov(edx, Operand(ebp, kStartIndex)); 844 __ mov(edx, Operand(ebp, kStartIndex));
831 __ sub(ecx, Operand(ebp, kInputStart)); 845 __ sub(ecx, Operand(ebp, kInputStart));
832 if (mode_ == UC16) { 846 if (mode_ == UC16) {
833 __ lea(ecx, Operand(ecx, edx, times_2, 0)); 847 __ lea(ecx, Operand(ecx, edx, times_2, 0));
834 } else { 848 } else {
835 __ add(ecx, edx); 849 __ add(ecx, edx);
836 } 850 }
837 for (int i = 0; i < num_saved_registers_; i++) { 851 for (int i = 0; i < num_saved_registers_; i++) {
838 __ mov(eax, register_location(i)); 852 __ mov(eax, register_location(i));
853 if (i == 0 && global()) {
854 // Keep capture start in edx for the zero-length check later.
855 __ mov(edx, eax);
856 }
839 // Convert to index from start of string, not end. 857 // Convert to index from start of string, not end.
840 __ add(eax, ecx); 858 __ add(eax, ecx);
841 if (mode_ == UC16) { 859 if (mode_ == UC16) {
842 __ sar(eax, 1); // Convert byte index to character index. 860 __ sar(eax, 1); // Convert byte index to character index.
843 } 861 }
844 __ mov(Operand(ebx, i * kPointerSize), eax); 862 __ mov(Operand(ebx, i * kPointerSize), eax);
845 } 863 }
846 } 864 }
847 __ mov(eax, Immediate(SUCCESS)); 865
866 if (global()) {
867 // Restart matching if the regular expression is flagged as global.
868 // Increment success counter.
869 __ inc(Operand(ebp, kSuccessfulCaptures));
870 // Capture results have been stored, so the number of remaining global
871 // output registers is reduced by the number of stored captures.
872 __ mov(ecx, Operand(ebp, kNumOutputRegisters));
873 __ sub(ecx, Immediate(num_saved_registers_));
874 // Check whether we have enough room for another set of capture results.
875 __ cmp(ecx, Immediate(num_saved_registers_));
876 __ j(less, &exit_label_);
877
878 __ mov(Operand(ebp, kNumOutputRegisters), ecx);
879 // Advance the location for output.
880 __ add(Operand(ebp, kRegisterOutput),
881 Immediate(num_saved_registers_ * kPointerSize));
882
883 // Prepare eax to initialize registers with its value in the next run.
884 __ mov(eax, Operand(ebp, kInputStartMinusOne));
885
886 // Special case for zero-length matches.
887 // edx: capture start index
888 __ cmp(edi, edx);
889 // Not a zero-length match, restart.
890 __ j(not_equal, &restart);
891 // edi (offset from the end) is zero if we already reached the end.
892 __ test(edi, edi);
893 __ j(zero, &exit_label_, Label::kNear);
894 // Advance current position after a zero-length match.
895 if (mode_ == UC16) {
896 __ add(edi, Immediate(2));
897 } else {
898 __ inc(edi);
899 }
900 __ jmp(&restart);
901 } else {
902 __ mov(eax, Immediate(SUCCESS));
903 }
848 } 904 }
849 // Exit and return eax 905
850 __ bind(&exit_label_); 906 __ bind(&exit_label_);
907 if (global()) {
908 // Return the number of successful captures.
909 __ mov(eax, Operand(ebp, kSuccessfulCaptures));
910 }
911
912 __ bind(&return_eax);
851 // Skip esp past regexp registers. 913 // Skip esp past regexp registers.
852 __ lea(esp, Operand(ebp, kBackup_ebx)); 914 __ lea(esp, Operand(ebp, kBackup_ebx));
853 // Restore callee-save registers. 915 // Restore callee-save registers.
854 __ pop(ebx); 916 __ pop(ebx);
855 __ pop(edi); 917 __ pop(edi);
856 __ pop(esi); 918 __ pop(esi);
857 // Exit function frame, restore previous one. 919 // Exit function frame, restore previous one.
858 __ pop(ebp); 920 __ pop(ebp);
859 __ ret(0); 921 __ ret(0);
860 922
861 // Backtrack code (branch target for conditional backtracks). 923 // Backtrack code (branch target for conditional backtracks).
862 if (backtrack_label_.is_linked()) { 924 if (backtrack_label_.is_linked()) {
863 __ bind(&backtrack_label_); 925 __ bind(&backtrack_label_);
864 Backtrack(); 926 Backtrack();
865 } 927 }
866 928
867 Label exit_with_exception; 929 Label exit_with_exception;
868 930
869 // Preempt-code 931 // Preempt-code
870 if (check_preempt_label_.is_linked()) { 932 if (check_preempt_label_.is_linked()) {
871 SafeCallTarget(&check_preempt_label_); 933 SafeCallTarget(&check_preempt_label_);
872 934
873 __ push(backtrack_stackpointer()); 935 __ push(backtrack_stackpointer());
874 __ push(edi); 936 __ push(edi);
875 937
876 CallCheckStackGuardState(ebx); 938 CallCheckStackGuardState(ebx);
877 __ or_(eax, eax); 939 __ or_(eax, eax);
878 // If returning non-zero, we should end execution with the given 940 // If returning non-zero, we should end execution with the given
879 // result as return value. 941 // result as return value.
880 __ j(not_zero, &exit_label_); 942 __ j(not_zero, &return_eax);
881 943
882 __ pop(edi); 944 __ pop(edi);
883 __ pop(backtrack_stackpointer()); 945 __ pop(backtrack_stackpointer());
884 // String might have moved: Reload esi from frame. 946 // String might have moved: Reload esi from frame.
885 __ mov(esi, Operand(ebp, kInputEnd)); 947 __ mov(esi, Operand(ebp, kInputEnd));
886 SafeReturn(); 948 SafeReturn();
887 } 949 }
888 950
889 // Backtrack stack overflow code. 951 // Backtrack stack overflow code.
890 if (stack_overflow_label_.is_linked()) { 952 if (stack_overflow_label_.is_linked()) {
(...skipping 26 matching lines...) Expand all
917 __ pop(edi); 979 __ pop(edi);
918 __ pop(esi); 980 __ pop(esi);
919 SafeReturn(); 981 SafeReturn();
920 } 982 }
921 983
922 if (exit_with_exception.is_linked()) { 984 if (exit_with_exception.is_linked()) {
923 // If any of the code above needed to exit with an exception. 985 // If any of the code above needed to exit with an exception.
924 __ bind(&exit_with_exception); 986 __ bind(&exit_with_exception);
925 // Exit with Result EXCEPTION(-1) to signal thrown exception. 987 // Exit with Result EXCEPTION(-1) to signal thrown exception.
926 __ mov(eax, EXCEPTION); 988 __ mov(eax, EXCEPTION);
927 __ jmp(&exit_label_); 989 __ jmp(&return_eax);
928 } 990 }
929 991
930 CodeDesc code_desc; 992 CodeDesc code_desc;
931 masm_->GetCode(&code_desc); 993 masm_->GetCode(&code_desc);
932 Handle<Code> code = 994 Handle<Code> code =
933 masm_->isolate()->factory()->NewCode(code_desc, 995 masm_->isolate()->factory()->NewCode(code_desc,
934 Code::ComputeFlags(Code::REGEXP), 996 Code::ComputeFlags(Code::REGEXP),
935 masm_->CodeObject()); 997 masm_->CodeObject());
936 PROFILE(masm_->isolate(), RegExpCodeCreateEvent(*code, *source)); 998 PROFILE(masm_->isolate(), RegExpCodeCreateEvent(*code, *source));
937 return Handle<HeapObject>::cast(code); 999 return Handle<HeapObject>::cast(code);
(...skipping 390 matching lines...) Expand 10 before | Expand all | Expand 10 after
1328 } 1390 }
1329 1391
1330 1392
1331 #undef __ 1393 #undef __
1332 1394
1333 #endif // V8_INTERPRETED_REGEXP 1395 #endif // V8_INTERPRETED_REGEXP
1334 1396
1335 }} // namespace v8::internal 1397 }} // namespace v8::internal
1336 1398
1337 #endif // V8_TARGET_ARCH_IA32 1399 #endif // V8_TARGET_ARCH_IA32
OLDNEW

Powered by Google App Engine
This is Rietveld 408576698