Quick Answer:
In March 2026, Meta released TRIBE v2 — an AI model trained on fMRI brain scans of 700+ people that predicts which regions of the human brain activate in response to any image, video, or sound. Kira spent eight weeks applying its six-trigger framework across 12 GCC client accounts and 847 pieces of content. The pattern is consistent: content that activates more brain regions simultaneously stops the scroll, and the creatives that hit a "convergence moment" — product in use, a visible face, and emotional audio — before the four-second mark carried a 67% lower cost per acquisition.
Meta has been studying the human brain. We have been applying it.
Most agencies read the paper. We ran the experiment. TRIBE v2 is publicly available and the research is open-source — anyone can download it. What is not downloadable is eight weeks of live testing across Gulf audiences, in Gulf dialects, on real budgets. That testing is what this article is about.
TRIBE v2 acts as a digital mirror of human neural activity. It was trained on high-resolution fMRI recordings of more than 700 healthy volunteers shown images, videos, podcasts, and text. It predicts — at a resolution of 70,000 voxels — which brain regions activate in response to any piece of content, processing sight, sound, and language at the same time. In many cases its predictions are more accurate than a live fMRI scan, because the model filters out biological noise like heartbeats and movement and predicts the canonical brain response.
How the model sees your content: three stages, one brain map
1. Tri-modal encoding. Audio, video, and text are processed simultaneously using Meta's own pretrained models. Every element of your creative — the visuals, the sound, the words — is encoded together, not in isolation.
2. Universal integration. A transformer identifies the patterns shared across all stimuli, all people, and all tasks. It isolates what is universally human about a response, rather than what is specific to one viewer.
3. Brain mapping. Those universal patterns are mapped onto 70,000 individual fMRI voxels — a three-dimensional map of exactly where in the brain a piece of content lands.
The six brain triggers
Every piece of content activates some of these. The best activate all of them at once:
- Faces — emotional recognition, trust, social connection.
- Bodies — human presence, movement, physical action.
- Emotions — affective response: desire, fear, joy.
- Semantics — meaning, relevance, conceptual understanding.
- Speech — verbal processing and language comprehension.
- Places — spatial context and environment recognition.
The brain does not process content linearly. It processes all six in parallel — and the content that lights up the most regions simultaneously is the content that stops the thumb.
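The tagging idea can be made concrete with a small Python sketch. The six trigger names come from the list above; the `TRIGGERS` constant, the `trigger_count` helper, and the two example creatives are hypothetical illustrations, not part of TRIBE v2 or of Kira's tooling.

```python
# Hypothetical sketch: tag a creative with the six triggers and count them.
TRIGGERS = frozenset({"faces", "bodies", "emotions", "semantics", "speech", "places"})

def trigger_count(tags):
    """Return how many of the six triggers a creative is tagged with."""
    unknown = set(tags) - TRIGGERS
    if unknown:
        raise ValueError(f"unknown trigger(s): {sorted(unknown)}")
    return len(set(tags))

# Two made-up creatives, tagged by hand for illustration only.
creatives = {
    "product_flatlay": {"semantics", "places"},
    "founder_demo": {"faces", "bodies", "emotions", "semantics", "speech"},
}

# Rank creatives by how many triggers they hit at once.
ranked = sorted(creatives, key=lambda name: trigger_count(creatives[name]), reverse=True)
print(ranked)
```

A raw count treats every trigger as equally important, which is a simplification; it is only meant to show how a content library could be labelled before comparing it with performance data.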
What we found: Kira internal testing, March–May 2026
Across 12 active GCC client accounts spanning F&B, fashion, wellness, and e-commerce, we categorised all 847 pieces of content by which brain triggers each one activated, then mapped those patterns against real ad performance: click-through rate, three-second hook rate, watch-through rate, and conversion rate. Five findings held up.
Finding 01 — A human face beat faceless content by 2.3× on CTR
Creatives that put a clearly visible human face in the first 1.5 seconds generated a mean CTR of 4.7% versus 2.1% for faceless content. The effect was strongest in fashion and wellness. In F&B the gap narrowed — but content showing hands preparing food (the Bodies trigger) compensated with a 1.8× hook-rate advantage.
- Mean CTR — Face present: 4.7% · No face: 2.1%
- 3-second hook rate — Face present: 68% · No face: 41%
- Watch-through (15s) — Face present: 54% · No face: 29%
If your first frame has no face and no body in motion, you are competing at a structural disadvantage.
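As a quick check on the size of the gap, the ratios can be recomputed from the rounded percentages listed above. The dictionary keys and the `lift_ratios` helper are names chosen for this sketch.

```python
# Figures as published above (rounded percentages).
face = {"ctr": 4.7, "hook_rate_3s": 68.0, "watch_through_15s": 54.0}
no_face = {"ctr": 2.1, "hook_rate_3s": 41.0, "watch_through_15s": 29.0}

def lift_ratios(treated, baseline):
    """Ratio of treated to baseline for every metric, rounded to 2 decimals."""
    return {metric: round(treated[metric] / baseline[metric], 2) for metric in treated}

ratios = lift_ratios(face, no_face)
for metric, ratio in ratios.items():
    print(f"{metric}: {ratio}x")
```

On the rounded figures the CTR ratio comes out at roughly 2.2×, the hook-rate ratio at roughly 1.7×, and the watch-through ratio at roughly 1.9×. The 2.3× in the heading presumably reflects the unrounded means.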
Finding 02 — Emotional ads converted at 3.1× the rate of informational ones
We split content into Emotional (built to trigger a feeling — aspiration, humour, nostalgia, urgency) and Informational (features, specs, price-led). Emotional content activated the Emotions and Semantics regions together — the combination TRIBE v2 identifies as the highest-intent neural state.
- Emotion-first — Conversion rate: 6.8% · CPA index: 1.0× (baseline)
- Information-first — Conversion rate: 2.2% · CPA index: 3.1× higher
- Emotion + information combined — Conversion rate: 8.4% · CPA index: 0.8× (lower)
Information is not useless. Emotion opens the door; information closes it. The best ads did both — in that order.
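The CPA index values above are consistent with a simple relationship between CPA and conversion rate. The sketch below assumes the cost per click is the same across the three groups (an assumption; click costs are not reported here), in which case CPA scales with the inverse of the conversion rate. The `cpa_index` helper is a name chosen for this example.

```python
def cpa_index(conversion_rate, baseline_rate):
    """Relative CPA versus the baseline, assuming equal cost per click.

    With CPA = cost_per_click / conversion_rate, the click cost cancels out
    and the index is simply baseline_rate / conversion_rate.
    """
    return baseline_rate / conversion_rate

# Conversion rates (%) as published above.
rates = {"emotion_first": 6.8, "information_first": 2.2, "combined": 8.4}
baseline = rates["emotion_first"]

index = {name: round(cpa_index(rate, baseline), 1) for name, rate in rates.items()}
print(index)
```

Under that assumption the information-first group lands at about 3.1× the baseline CPA and the combined group at about 0.8×, matching the published index.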
Finding 03 — Three patterns predicted virality with 79% accuracy
Across the top 50 organic posts from all accounts, three structural patterns consistently predicted whether content would exceed 3× average reach. When all three appeared in a single piece, it cleared 3× reach in 79% of cases.
- The Unexpected Familiar (Places + Semantics) — a familiar object, place, or person in an unexpected context. Average share rate: 4.2× baseline.
- The Unresolved Tension (Emotions + Speech) — opens with a question, conflict, or incomplete action and resolves it within seven seconds. Average watch-through: 71%.
- The Social Mirror (Faces + Bodies + Emotions) — the subject does something the viewer recognises from their own life. Average save rate: 3.8× baseline.
Finding 04 — The brain signals buying intent 4–7 seconds before the viewer decides
TRIBE v2 shows the Semantics and Emotions regions firing together in a "convergence window" — the moment meaning and feeling align, which is the neurological precursor to a purchase decision. In our ad data, the creative equivalent is the moment the product is shown in use by a person whose face is visible, carried by music or voiceover with emotional weight.
Ads that hit this convergence moment within the first four seconds had a 67% lower CPA than ads that hit it after seven seconds or not at all. The rule we now apply to every brief: the convergence moment must happen before second four. Everything before it is the hook; everything after it is the close.
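The four-second rule can be checked mechanically once a creative's timeline is annotated. The sketch below is one hypothetical way to do it: each element (product in use, visible face, emotional audio) is a list of `(start, end)` intervals in seconds, and the convergence moment is taken as the earliest time at which all three overlap. The function names and the example timeline are illustrative assumptions.

```python
def convergence_moment(product_in_use, face_visible, emotional_audio):
    """Earliest second at which all three elements overlap, or None.

    Each argument is a list of (start, end) intervals in seconds.
    """
    tracks = [product_in_use, face_visible, emotional_audio]
    # The earliest overlap, if one exists, begins at the start of some interval.
    starts = sorted(start for track in tracks for start, _ in track)
    for t in starts:
        if all(any(s <= t < e for s, e in track) for track in tracks):
            return t
    return None

def passes_four_second_rule(moment, limit=4.0):
    """True when the convergence moment exists and falls before the limit."""
    return moment is not None and moment < limit

# Made-up timeline for illustration.
moment = convergence_moment(
    product_in_use=[(2.5, 9.0)],
    face_visible=[(0.0, 1.5), (3.0, 10.0)],
    emotional_audio=[(0.0, 15.0)],
)
print(moment, passes_four_second_rule(moment))
```

In this example the face leaves the frame before the product appears, so the three elements first coincide at 3.0 seconds, which still clears the rule.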
Finding 05 — Silent ads lost 58% of their emotional activation
TRIBE v2 treats audio as a first-class input, not a secondary layer — the Speech and Emotions regions depend heavily on it. We ran identical creatives with and without audio across six accounts:
- No audio — Conversion rate: 2.9%
- Music only — Conversion rate: 4.8% (+65% vs silent)
- Voiceover — Conversion rate: 5.7% (+97% vs silent)
- Music + voiceover — Conversion rate: 6.1% (+110% vs silent)
The brain does not watch your ad. It experiences it. Audio is a neurological trigger, not a production detail.
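The lift percentages in the list above line up with the ratio of each conversion rate to the silent conversion rate, which a few lines of Python can confirm. The variable and function names are chosen for this sketch.

```python
# Conversion rates (%) as published above.
conversion_rate = {
    "no_audio": 2.9,
    "music_only": 4.8,
    "voiceover": 5.7,
    "music_plus_voiceover": 6.1,
}

def lift_vs_silent(rate, silent):
    """Percentage lift of a conversion rate over the silent baseline."""
    return (rate / silent - 1) * 100

silent = conversion_rate["no_audio"]
lifts = {
    variant: round(lift_vs_silent(rate, silent), 1)
    for variant, rate in conversion_rate.items()
    if variant != "no_audio"
}
print(lifts)
```

The recomputed lifts are about 65.5%, 96.6%, and 110.3%, within a point of the published figures; the small differences are what rounding of the underlying rates would produce.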
Five rules we now apply to every creative brief
- Lead with a face or a body in motion — first frame, every time.
- Build for the Unexpected Familiar — recognisable instantly, surprising enough to stop the scroll.
- Trigger emotion before you deliver information — feeling opens, logic closes.
- Hit the convergence moment before second four — product in use, face visible, emotional audio.
- Never run silent — voiceover plus music performs best.
Why this matters for the Gulf specifically
Gulf audiences are not a subset of a global average. The cultural, linguistic, and contextual cues that activate the Emotions and Semantics regions here are different — and the only way to know which ones is to test here. Our 847-piece test set was GCC-native: Kuwaiti and Gulf dialect, local context, local budgets. The findings apply where they were measured.
This TRIBE-informed creative framework is now standard on every account Kira manages. We do not hand clients a theory — we run their media against it, brief by brief, and hold the creative to the same five rules above. If you want neuroscience-backed creative strategy running on your account, that is what a Kira engagement is. Work with Kira.
Kira · 2026 · Anthropic Academy Certified · Meta Tech Provider