TL;DR : Newly Added IntPtrConstant Node in Maglev Leads to Maglev Node Confusion
https://issues.chromium.org/issues/410136467 Report 提到 Commit da075df0ad2a3c0ff8a1db389704650f4c1cb648 added a new Constant node: IntPtrConstant, which is used for constant folding of TyperArray.length, indicating the length of TyperArray.
Analyzing IntPtr commit https://chromium.googlesource.com/v8/v8/+/da075df0ad2a3c0ff8a1db389704650f4c1cb648
Float64Array 是一種 TypedArray TypedArray 的大小要在宣告時指定,之後不能改https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float64Array
這些 turbolev-graph-builder.cc 中的 Process 都是用在把 Maglev IR 轉成 turboshaft IR 的地方
1 2 3 4 5 6 7 8 9 10 RunMaglevOptimizations (data, compilation_info.get (), maglev_graph); data->InitializeGraphComponent (nullptr ); std::optional<BailoutReason> bailout; maglev::GraphProcessor<NodeProcessorBase> builder ( data, data->graph(), temp_zone, compilation_info->toplevel_compilation_unit(), &bailout) ; builder.ProcessGraph (maglev_graph);
看幾個例子 V<node> 是 Turboshaft IR
Example 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 maglev::ProcessResult Process (maglev::IntPtrConstant* node, const maglev::ProcessingState& state) { SetMap (node, __ WordPtrConstant (node->value ())); return maglev::ProcessResult::kContinue; } void SetMap (maglev::NodeBase* node, V<Any> idx) { [...] node_mapping_[node] = idx; } V<WordPtr> WordPtrConstant (uintptr_t value) { return V<WordPtr>::Cast (WordConstant (value, WordRepresentation::WordPtr ())); }
Example2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 maglev::ProcessResult Process (maglev::LoadUnsignedIntTypedArrayElement* node, const maglev::ProcessingState& state) { if (node->has_off_heap_constant_typed_array ()) { SetMap (node, BuildConstantTypedArrayLoad (node->constant_typed_array (), Map <Word32>(node->index_input ()), node->elements_kind ())); } else { SetMap (node, BuildTypedArrayLoad (Map <JSTypedArray>(node->object_input ()), Map <Word32>(node->index_input ()), node->elements_kind ())); } return maglev::ProcessResult::kContinue; } V<Untagged> BuildTypedArrayLoad (V<JSTypedArray> typed_array, V<Word32> index, ElementsKind kind) { auto [data_pointer, base_pointer] = GetTypedArrayDataAndBasePointers (typed_array); return __ LoadTypedElement (typed_array, base_pointer, data_pointer, __ ChangeUint32ToUintPtr (index), GetArrayTypeFromElementsKind (kind)); }
Optimizing Property Load of TypedArray::Length 條件符合時直接建立一個 IntPtrConstant Node 原先需要先計算 length 的 offset,再從 object 上取值 現在可以直接用這個 Node 取值來加速
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 ReduceResult MaglevGraphBuilder::BuildLoadTypedArrayLength ( ValueNode* object, ElementsKind elements_kind) { DCHECK (IsTypedArrayOrRabGsabTypedArrayElementsKind (elements_kind)); bool is_variable_length = IsRabGsabTypedArrayElementsKind (elements_kind); if (!is_variable_length) { if (auto const_object = object->TryGetConstant (broker ())) { if (const_object->IsJSTypedArray ()) { auto const_typed_array = const_object->AsJSTypedArray (); if (!const_typed_array.is_on_heap () && !IsRabGsabTypedArrayElementsKind ( const_typed_array.elements_kind (broker ()))) { size_t length = const_typed_array.length (broker ()); static_assert (ArrayBuffer::kMaxByteLength <= std::numeric_limits<intptr_t >::max ()); return GetIntPtrConstant (static_cast <intptr_t >(length)); } } } if (ValueNode* length = known_node_aspects ().TryFindLoadedConstantProperty ( object, KnownNodeAspects::LoadedPropertyMapKey::TypedArrayLength ())) { return length; } } ValueNode* result = AddNewNode <LoadTypedArrayLength>({object}, elements_kind); if (!is_variable_length) { RecordKnownProperty ( object, KnownNodeAspects::LoadedPropertyMapKey::TypedArrayLength (), result, true , compiler::AccessMode::kLoad); } return result; }
看完檢查的邏輯後我想確認 maglev 中的 constant if (auto const_object = object->TryGetConstant(broker()))
是什麼,所以做了底下的小實驗 這個會通過檢查進到 GetIntPtrConstant ,嘗試 reuse 或建立 IntPtrConstant Node
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 const ta = new Uint8Array ([1 , 2 , 3 , 4 , 5 ]);function foo ( ) { const len = ta.length ; let sum = 0 ; for (let i = 0 ; i < len; i++) { sum += ta[i]; } return sum / len; } %PrepareFunctionForOptimization (foo); for (let i = 0 ; i < 10_000 ; ++i) foo ();%OptimizeMaglevOnNextCall (foo); foo ();
但底下這個不會,推測是 Maglev 無法判斷 arr 會不會變動所以沒把他當 constant(因為他是 parameter)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 function processTypedArray (arr ) { const len = arr.length ; let sum = 0 ; for (let i = 0 ; i < len; i++) { sum += arr[i]; } return sum / arr.length ; } const array = new Uint8Array ([1 , 2 , 3 , 4 , 5 ]);for (let i = 0 ; i < 10000 ; i++) { processTypedArray (array); }
Patch Analysis https://chromium-review.googlesource.com/c/v8/v8/+/6455743
在 MaglevAssembler::MaterialiseValueNode
增加了對 IntPtrConstant 的 case 檢查 看到這邊的猜測是踩到 DCHECK(!value->allocation().IsConstant())
導致 crash,但在 release 版沒有 DCHECK 所以可以走到後面導致 Node Confusion
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 void MaglevAssembler::MaterialiseValueNode (Register dst, ValueNode* value) { switch (value->opcode ()) { case Opcode::kInt32Constant: { int32_t int_value = value->Cast <Int32Constant>()->value (); if (Smi::IsValid (int_value)) { Move (dst, Smi::FromInt (int_value)); } else { MoveHeapNumber (dst, int_value); } return ; } case Opcode::kIntPtrConstant: { intptr_t intptr_value = value->Cast <IntPtrConstant>()->value (); if (intptr_value <= std::numeric_limits<int >::max () && Smi::IsValid (static_cast <int >(intptr_value))) { Move (dst, Smi::FromInt (static_cast <int >(intptr_value))); } else { MoveHeapNumber (dst, intptr_value); } return ; } case Opcode::kUint32Constant: { uint32_t uint_value = value->Cast <Uint32Constant>()->value (); if (Smi::IsValid (uint_value)) { Move (dst, Smi::FromInt (uint_value)); } else { MoveHeapNumber (dst, uint_value); } return ; } case Opcode::kFloat64Constant: { double double_value = value->Cast <Float64Constant>()->value ().get_scalar (); int smi_value; if (DoubleToSmiInteger (double_value, &smi_value)) { Move (dst, Smi::FromInt (smi_value)); } else { MoveHeapNumber (dst, double_value); } return ; } default : break ; } DCHECK (!value->allocation ().IsConstant ()); DCHECK (value->allocation ().IsAnyStackSlot ()); using D = NewHeapNumberDescriptor; DoubleRegister builtin_input_value = D::GetDoubleRegisterParameter (D::kValue); MemOperand src = ToMemOperand (value->allocation ()); switch (value->properties ().value_representation ()) { case ValueRepresentation::kInt32: { Label done; TemporaryRegisterScope temps (this ) ; Register scratch = temps.AcquireScratch (); Move (scratch, src); SmiTagInt32AndJumpIfSuccess (dst, scratch, &done, Label::kNear); Int32ToDouble (builtin_input_value, scratch); CallBuiltin <Builtin::kNewHeapNumber>(builtin_input_value); Move (dst, kReturnRegister0); bind (&done); break ; } case ValueRepresentation::kUint32: { Label done; TemporaryRegisterScope temps (this ) ; Register scratch = temps.AcquireScratch (); Move (scratch, src); SmiTagUint32AndJumpIfSuccess (dst, scratch, &done, Label::kNear); Uint32ToDouble (builtin_input_value, scratch); CallBuiltin <Builtin::kNewHeapNumber>(builtin_input_value); Move (dst, kReturnRegister0); bind (&done); break ; } case ValueRepresentation::kFloat64: LoadFloat64 (builtin_input_value, src); CallBuiltin <Builtin::kNewHeapNumber>(builtin_input_value); Move (dst, kReturnRegister0); break ; case ValueRepresentation::kHoleyFloat64: { Label done, box; JumpIfNotHoleNan (src, &box, Label::kNear); LoadRoot (dst, RootIndex::kUndefinedValue); Jump (&done); bind (&box); LoadFloat64 (builtin_input_value, src); CallBuiltin <Builtin::kNewHeapNumber>(builtin_input_value); Move (dst, kReturnRegister0); bind (&done); break ; } case ValueRepresentation::kIntPtr: { Label done; TemporaryRegisterScope temps (this ) ; Register scratch = temps.AcquireScratch (); Move (scratch, src); SmiTagIntPtrAndJumpIfSuccess (dst, scratch, &done, Label::kNear); IntPtrToDouble (builtin_input_value, scratch); CallBuiltin <Builtin::kNewHeapNumber>(builtin_input_value); Move (dst, kReturnRegister0); bind (&done); break ; } case ValueRepresentation::kTagged: UNREACHABLE (); } }
MaglevAssembler::MaterialiseValueNode 只有在 maglev-code-generator 中的 ExceptionHandlerTrampolineBuilder Class 中被使用,對應到的 Method 是 EmitMaterialisationsAndPushResults
和 EmitPopMaterialisedResults
。 可以發現它會檢查 if (IsConstantNode(move.source->opcode()))
來決定要不要呼叫 MaterialiseValueNode()
,所以只要看 EmitPopMaterialisedResults
追完後會發現 call chain 會長這樣(上面可能還有一些東西但不用看)MaglevCodeGenerator::Assemble()
->MaglevCodeGenerator::EmitCode()
->MaglevCodeGenerator::EmitExceptionHandlerTrampolines()
->ExceptionHandlerTrampolineBuilder::Build()
->ExceptionHandlerTrampolineBuilder::EmitTrampolineFor()
->ExceptionHandlerTrampolineBuilder::EmitPopMaterialisedResults()
->MaglevAssembler::MaterialiseValueNode()
所以猜測在 try-catch 中 access 是 constant 的 TypedArray.length 就能觸發,寫了以下的 PoC 但它不會進 MaglevAssembler::MaterialiseValueNode() 看產出來的圖發現 exceptional handler block 不會有跟 a1 有關的東西 可能是 Maglev 認為 Exceptional handler 不需要 a1 的資訊?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 const f64Arr = new Float64Array ();function opt_me ( ) { try { f64Arr.length ; throw "err" ; } catch (e) { console .log (e); } } %PrepareFunctionForOptimization (opt_me); opt_me ();%OptimizeMaglevOnNextCall (opt_me); opt_me ();
想不到要怎麼把 a1 帶進 exceptional handler 的 basic block 所以去翻作者的 PoC 看起來是用 fuzzer 產的
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 const f64Arr = new Float64Array ();f64Arr.buffer ; function opt_me (a0 ) { for (let i = 0 ; i < 50 ; i++) { } try { a0[0 ]; a0 = f64Arr.length ; throw "err" ; } catch (e) { console .log (e); } } %PrepareFunctionForOptimization (opt_me); opt_me (f64Arr);opt_me (f64Arr);%OptimizeMaglevOnNextCall (opt_me); opt_me ();
拿來改了一下然後印出 Maglev IR graph (d8 –print-maglev-graphs poc.js) 發現它多了 23/17: φᵀₑ a0 (compressed) → [rcx|R|t] (spilled: [stack:0|t]), live range: [23-33]
看起來是要把 a0 帶進去 exceptional handler(a0[0] 會多一個 )throw 讓 exception phi 出現
ModifiedPoC 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 function opt_me (a0 ) { try { const f64Arr = new Float64Array (); a0[0 ]; a0 = f64Arr.length ; throw "err" ; } catch (e) { console .log (e); } } %PrepareFunctionForOptimization (opt_me); opt_me (f64Arr);%OptimizeMaglevOnNextCall (opt_me); opt_me ();
後續會用 MemOperand src = ToMemOperand(value->allocation())
導致 node confusion 應該是有某種原因導致這邊不能是 Constant,所以才做 switch 檢查 Constant 並額外處理 有試著找其他 Varient 但沒找到,所以沒有往下追 constant / non-constant node confusion 後會發生什麼和能不能 Exploit 追最新 commit 有看到類似的情境的話會再回來研究這部分