Parameterize the image encoder as $f_{iq}$, with query feature $q_{ii}$ and key feature $k_{ii}$.
Parameterize the textual encoder as $f_{cq}(\cdot; \Theta_q, \Phi_{cq})$ and the momentum textual encoder as $f_{ck}(\cdot; \Theta_k, \Phi_{ik})$.
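The notes name a "momentum" textual encoder but do not spell out how its parameters are updated. A minimal sketch follows, assuming the common MoCo-style exponential moving average in which the key parameters $\Theta_k$ slowly track the query parameters $\Theta_q$. The function name `momentum_update`, the coefficient `m`, and the toy parameter lists are hypothetical and chosen only for illustration; the paper's actual update rule may differ.

```python
def momentum_update(query_params, key_params, m=0.999):
    """Return key parameters after one EMA step: k <- m * k + (1 - m) * q.

    Assumption: a MoCo-style momentum update; `m` is a hypothetical
    coefficient, not a value taken from the paper.
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


# Toy stand-ins for Theta_q (query encoder) and Theta_k (momentum encoder).
theta_q = [1.0, 2.0]
theta_k = [0.0, 0.0]

# One update step: the key parameters move a small fraction toward the query ones.
theta_k = momentum_update(theta_q, theta_k, m=0.9)
```

With a coefficient close to 1, the key encoder changes slowly, which is the usual motivation for keeping a separate momentum copy rather than sharing weights with the query encoder.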
$c^{\dagger}_j$ and $c^{\star}_j$ are different augmented examples.
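Gripes
In the first figure, the letter subscripts are hidden by the black background, and the authors have not released their code. That is not the standard one expects from a CVPR paper.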