GDELT Media Tone Index
This page explains how InvestLens retrieves GDELT media tone, converts it into a 30-day standardized Media Tone Index (z-score), and visualizes it.
1. What we get from GDELT
We request two daily timeline series for the same query and date range:
- TimelineTone (daily tone)
  - Returns a daily “Average Tone” value for the query’s matching news coverage (the average tone across all matching articles within that day’s bucket).
  - In our data model, this becomes tone_t for day t.
- TimelineVolRaw (daily volume)
  - Returns the raw number of matching articles per day, count_t.
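As a rough illustration, the two requests could be built as follows. This sketch assumes the series are fetched from the public GDELT DOC 2.0 API using its timelinetone and timelinevolraw modes; the helper name build_timeline_url and the example query are hypothetical, and InvestLens may issue its requests differently.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed endpoint: the public GDELT DOC 2.0 API.
GDELT_DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_timeline_url(query: str, mode: str, start: datetime, end: datetime) -> str:
    """Build a timeline request URL for one mode over [start, end] (UTC)."""
    params = {
        "query": query,
        "mode": mode,  # "timelinetone" or "timelinevolraw"
        "format": "json",
        "startdatetime": start.strftime("%Y%m%d%H%M%S"),
        "enddatetime": end.strftime("%Y%m%d%H%M%S"),
    }
    return f"{GDELT_DOC_API}?{urlencode(params)}"


# Same query and date range for both series (hypothetical example values).
end = datetime(2024, 1, 31, tzinfo=timezone.utc)
start = end - timedelta(days=30)
tone_url = build_timeline_url('"Example Corp"', "timelinetone", start, end)
volume_url = build_timeline_url('"Example Corp"', "timelinevolraw", start, end)
```

Only the mode parameter differs between the two URLs, which keeps the tone and volume series aligned on the same query and dates.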
2. What “tone” means in this context
2.1 GDELT “Average Tone”
GDELT’s “Average Tone” is a daily aggregate tone score computed from the language in matched news coverage for the query.
- Positive values indicate more positive language on average.
- Negative values indicate more negative language on average.
- Values closer to 0 indicate more neutral or balanced language on average.
InvestLens uses the provider’s tone values as received and standardizes them over a rolling window.
3. How we compute the 30-day z-score series
We compute z-scores over a fixed 30-day UTC window per query.
3.1 Build the merged daily series
For each day t in the window, we merge:
- tone_t from TimelineTone
- count_t from TimelineVolRaw
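The merge can be sketched as below. The function name merge_daily, the row layout, and the choice to key both inputs by date are assumptions made for illustration, not the actual InvestLens data model.

```python
from datetime import date, timedelta
from typing import Optional


def merge_daily(
    tone_by_day: dict[date, float],
    count_by_day: dict[date, int],
    end_day: date,
    window_days: int = 30,
) -> list[dict]:
    """Return one row per UTC day in the window, oldest first."""
    rows = []
    for offset in range(window_days - 1, -1, -1):
        day = end_day - timedelta(days=offset)
        count = count_by_day.get(day, 0)
        # A day without matching articles carries no tone measurement.
        tone: Optional[float] = tone_by_day.get(day) if count > 0 else None
        rows.append({"date": day, "tone": tone, "count": count})
    return rows


rows = merge_daily(
    tone_by_day={date(2024, 1, 30): -1.5},
    count_by_day={date(2024, 1, 30): 12},
    end_day=date(2024, 1, 31),
)
```

Every day in the window gets a row, so days missing from either series show up explicitly with count 0 and no tone.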
3.2 Choose which days are used for the baseline statistics
- A day with count_t = 0 is treated as “no measurement,” not “neutral tone.”
- Baseline statistics are computed using only days with count_t > 0.
3.3 Compute baseline mean and standard deviation (volume-aware)
Because TimelineTone is a daily average, we apply volume-aware weighting so higher-volume days influence the baseline more.
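We define a weight \(w_t \ge 0\) for each valid day, derived from count_t so that higher-volume days carry more weight.
We compute the volume-weighted mean (sums run over valid days):
\[ \mu = \frac{\sum_t w_t \, \text{tone}_t}{\sum_t w_t} \]
We compute the volume-weighted standard deviation (population-style):
\[ \sigma = \sqrt{\frac{\sum_t w_t \, (\text{tone}_t - \mu)^2}{\sum_t w_t}} \]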
If total weight is effectively zero, we fall back to an unweighted mean and standard deviation over valid days (count_t > 0).
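A minimal sketch of the baseline computation follows. It assumes the weight for each valid day is simply its raw article count; the actual weight function may differ (for example, a damped function of the count). The name baseline_stats and the 1e-12 threshold for "effectively zero" are also illustrative.

```python
import math
from typing import Optional


def baseline_stats(
    tones: list[Optional[float]], counts: list[int]
) -> tuple[float, float]:
    """Volume-weighted mean and population standard deviation over valid days."""
    # Keep only measured days (count > 0).
    valid = [(t, c) for t, c in zip(tones, counts) if c > 0 and t is not None]
    if not valid:
        raise ValueError("no measured days in the window")
    # Assumption: weight each valid day by its raw article count.
    weights = [float(c) for _, c in valid]
    total = sum(weights)
    if total < 1e-12:
        # Fallback: unweighted statistics over the valid days. With raw counts
        # this cannot trigger; it matters for weight functions that can reach 0.
        weights = [1.0] * len(valid)
        total = float(len(valid))
    mean = sum(w * t for (t, _), w in zip(valid, weights)) / total
    var = sum(w * (t - mean) ** 2 for (t, _), w in zip(valid, weights)) / total
    return mean, math.sqrt(var)


mean, std = baseline_stats([1.0, 3.0, None], [1, 3, 0])
```

In this example the day with tone 3.0 has three times the weight of the day with tone 1.0, so the mean is 2.5 rather than the unweighted 2.0.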
3.4 Compute daily z-scores
For each day with a valid measurement (count_t > 0):
\[ z_t = \frac{\text{tone}_t - \mu}{\sigma} \]
- If \(\sigma = 0\), we return \(z_t = 0\).
- For days where count_t = 0, we return tone = null and z = null so the frontend treats them as missing.
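The per-day rules can be sketched as follows, with Python's None standing in for null. The function name z_scores and the output shape are hypothetical; mean and std are the baseline values from section 3.3.

```python
from typing import Optional


def z_scores(
    tones: list[Optional[float]], counts: list[int], mean: float, std: float
) -> list[dict]:
    """Standardize each measured day; unmeasured days stay missing."""
    out = []
    for tone, count in zip(tones, counts):
        if count <= 0 or tone is None:
            # No articles that day: report missing values, not neutral tone.
            out.append({"tone": None, "z": None})
        elif std == 0:
            # Degenerate baseline: every measured day sits at the mean.
            out.append({"tone": tone, "z": 0.0})
        else:
            out.append({"tone": tone, "z": (tone - mean) / std})
    return out


result = z_scores([1.0, None, 3.0], [2, 0, 2], mean=2.0, std=1.0)
```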
3.5 What “baseline” means
“Baseline” refers to the reference distribution used to standardize tone within the last 30 days (using only measured days):
- baseline mean: \(\mu\)
- baseline standard deviation: \(\sigma\)
A value of \(z_t = 0\) means \(\text{tone}_t\) equals the window's baseline mean \(\mu\) (volume-weighted when weighting is active); it is also the value returned when \(\sigma = 0\).
4. What we visualize in the UI
InvestLens renders a range band and a “latest” marker. It does not plot the full daily time series in this widget.
4.1 Bar coordinate system and the z = 0 reference
The UI maps z-scores onto a 0–100 bar using a fixed transformation with clamping:
- \(z = 0\) is always mapped to the center of the bar (50%).
- Negative z values map left, positive z values map right.
- Z-scores are clamped (e.g., to [-3, 3]) so extreme values stay on the bar.
This means the center of the bar is always the “baseline mean” reference point in z-score space (because z is defined as tone minus mean, divided by standard deviation).
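One transformation that satisfies these properties is a linear map with symmetric clamping, sketched below. The linear form and the default clamp of 3 are assumptions (the clamp matches the example range above); the name z_to_percent is hypothetical.

```python
def z_to_percent(z: float, z_clamp: float = 3.0) -> float:
    """Map a z-score to a 0-100 bar position, with z = 0 at the center."""
    # Clamp so extreme values stay on the bar.
    clamped = max(-z_clamp, min(z_clamp, z))
    # Linear map: -z_clamp -> 0, 0 -> 50, +z_clamp -> 100.
    return 50.0 + (clamped / z_clamp) * 50.0


positions = [z_to_percent(z) for z in (-5.0, -1.5, 0.0, 1.5, 3.0)]
```

Because the map is symmetric around zero, the bar center stays tied to the baseline mean regardless of the clamp chosen.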
4.2 Range band
The shaded band is computed from the observed z-scores in the 30-day window:
- left edge = minimum observed z in the window (after display clamping and mapping)
- right edge = maximum observed z in the window (after display clamping and mapping)
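A sketch of the band computation, assuming the same linear, clamped mapping to a 0-100 bar (an illustrative choice) and a hypothetical function name range_band. Missing days (None) are ignored.

```python
from typing import Optional


def range_band(
    z_values: list[Optional[float]], z_clamp: float = 3.0
) -> Optional[tuple[float, float]]:
    """Return (left, right) bar positions for the observed z-scores."""
    observed = [z for z in z_values if z is not None]
    if not observed:
        return None  # nothing measured in the window, so no band to draw

    def to_percent(z: float) -> float:
        # Assumed mapping: clamp, then scale linearly so z = 0 lands on 50.
        clamped = max(-z_clamp, min(z_clamp, z))
        return 50.0 + (clamped / z_clamp) * 50.0

    return to_percent(min(observed)), to_percent(max(observed))


band = range_band([-1.5, None, 0.0, 4.0])
```

Here the maximum of 4.0 exceeds the clamp, so the right edge sits at the end of the bar.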
4.3 Latest marker
- The marker is the most recent day’s z-score if it exists.
- If the most recent day has z = null (no measurement), the frontend should use the most recent earlier day with a valid z-score.
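The fallback rule amounts to scanning backwards from the most recent day. A minimal sketch, with None standing in for null and a hypothetical function name latest_marker; the input is assumed to be ordered oldest first:

```python
from typing import Optional


def latest_marker(z_values: list[Optional[float]]) -> Optional[float]:
    """Return the most recent non-missing z-score, or None if there is none."""
    for z in reversed(z_values):
        if z is not None:
            return z
    return None


marker = latest_marker([0.2, -1.1, None])
```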
5. Interpretation
- Marker near the center (z close to 0): the latest measured tone is close to the 30-day baseline mean.
- Marker far left (large negative z): the latest measured tone is unusually negative relative to the measured days in the 30-day window.
- Marker far right (large positive z): the latest measured tone is unusually positive relative to the measured days in the 30-day window.
6. Limitations
6.1 Query ambiguity
Ticker/name queries can match unrelated mentions depending on how the query is formed.
6.2 Volume interpretation
count_t counts matching articles, not unique events, and may include duplicated or syndicated copies of the same story.
6.3 No-news days
Days with count_t = 0 are treated as missing measurements (tone and z are null), not neutral sentiment.