GDELT Media Tone Index

This page explains how InvestLens retrieves GDELT media tone, converts it into a 30-day standardized Media Tone Index (z-score), and visualizes it.

1. What we get from GDELT

We request two daily timeline series for the same query and date range:

  • TimelineTone (daily tone)

    • Returns a daily “Average Tone” value: the mean tone across all articles that match the query within that day’s bucket.
    • In our data model, this becomes tone_t for day t.
  • TimelineVolRaw (daily volume)

    • Returns the raw number of matching articles per day, count_t.
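
As a rough sketch of how the two requests might be formed, the snippet below builds one URL per mode for the same query and date range. It assumes the public GDELT DOC 2.0 API endpoint and its `query`, `mode`, `format`, `startdatetime`, and `enddatetime` parameters; this page does not state which endpoint or parameters InvestLens actually uses, and `build_timeline_url` and the example query are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumption: the GDELT DOC 2.0 API endpoint (not confirmed by this page).
GDELT_DOC_URL = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_timeline_url(query: str, mode: str, start: datetime, end: datetime) -> str:
    """Build a request URL for one timeline mode over [start, end] (UTC)."""
    params = {
        "query": query,
        "mode": mode,  # "TimelineTone" or "TimelineVolRaw"
        "format": "json",
        "startdatetime": start.strftime("%Y%m%d%H%M%S"),
        "enddatetime": end.strftime("%Y%m%d%H%M%S"),
    }
    return f"{GDELT_DOC_URL}?{urlencode(params)}"


# Hypothetical query and window; both series share the same query and dates.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 31, tzinfo=timezone.utc)
tone_url = build_timeline_url('"Acme Corp"', "TimelineTone", start, end)
volume_url = build_timeline_url('"Acme Corp"', "TimelineVolRaw", start, end)
```

Only the `mode` parameter differs between the two URLs, which keeps the tone and volume series aligned on the same query and window.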

2. What “tone” means in this context

2.1 GDELT “Average Tone”

GDELT’s “Average Tone” is a daily aggregate tone score computed from the language in matched news coverage for the query.

  • Positive values indicate more positive language on average.
  • Negative values indicate more negative language on average.
  • Values closer to 0 indicate more neutral or balanced language on average.

InvestLens uses the provider’s tone values as received, without rescaling, and standardizes them over the 30-day window described in the next section.

3. How we compute the 30-day z-score series

We compute z-scores over a fixed 30-day UTC window per query.

3.1 Build the merged daily series

For each day t in the window, we merge:

  • tone_t from TimelineTone
  • count_t from TimelineVolRaw
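
A minimal sketch of the merge step, assuming both series have already been parsed into dictionaries keyed by UTC date. The function name `merge_daily`, the record shape, and the choice to treat a missing volume bucket as zero articles are illustrative assumptions, not the confirmed InvestLens data model.

```python
from datetime import date, timedelta


def merge_daily(tone_by_day, count_by_day, start, days=30):
    """Return one record per calendar day in the window, oldest first."""
    merged = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        merged.append({
            "date": day,
            # None when the tone series has no bucket for this day.
            "tone": tone_by_day.get(day),
            # Assumption: a missing volume bucket means zero matching articles.
            "count": count_by_day.get(day, 0),
        })
    return merged


# Hypothetical input: only the first day of a 3-day window has coverage.
merged = merge_daily({date(2024, 1, 1): -1.5}, {date(2024, 1, 1): 12},
                     start=date(2024, 1, 1), days=3)
```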

3.2 Choose which days are used for the baseline statistics

  • A day with count_t = 0 is treated as “no measurement,” not “neutral tone.”
  • Baseline statistics are computed using only days with count_t > 0.

3.3 Compute baseline mean and standard deviation (volume-aware)

Because TimelineTone is a daily average, we apply volume-aware weighting so higher-volume days influence the baseline more.

We define weights:

\[ w_t = \log(1 + \text{count}_t) \]

We compute the volume-weighted mean:

\[ \mu = \frac{\sum_t w_t \cdot \text{tone}_t}{\sum_t w_t} \]

We compute the volume-weighted standard deviation (population-style):

\[ \sigma = \sqrt{\frac{\sum_t w_t \cdot (\text{tone}_t - \mu)^2}{\sum_t w_t}} \]

If total weight is effectively zero, we fall back to an unweighted mean and standard deviation over valid days (count_t > 0).
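
The formulas above can be sketched as follows. This is an illustration under stated assumptions rather than the production code: `baseline_stats` and the `1e-12` threshold for "effectively zero" are hypothetical, and `math.log1p` is the natural logarithm (the page does not state the base, but a different base rescales every weight by the same constant and so leaves the mean and standard deviation unchanged).

```python
import math


def baseline_stats(days):
    """days: iterable of (tone, count) pairs. Returns (mu, sigma), or
    (None, None) when no day in the window has a measurement."""
    # Only days with count > 0 contribute to the baseline.
    valid = [(tone, count) for tone, count in days if count > 0]
    if not valid:
        return None, None

    weights = [math.log1p(count) for _, count in valid]  # w_t = log(1 + count_t)
    total = sum(weights)
    if total < 1e-12:  # hypothetical threshold for "effectively zero"
        # Fall back to unweighted statistics over the valid days.
        weights = [1.0] * len(valid)
        total = float(len(valid))

    mu = sum(w * tone for w, (tone, _) in zip(weights, valid)) / total
    variance = sum(w * (tone - mu) ** 2 for w, (tone, _) in zip(weights, valid)) / total
    return mu, math.sqrt(variance)


# Two measured days with equal volume, plus one no-news day that is ignored.
mu, sigma = baseline_stats([(1.0, 5), (3.0, 5), (0.0, 0)])
```

With equal counts the weights cancel, so the result matches the unweighted population mean and standard deviation of the two measured days.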

3.4 Compute daily z-scores

For each day with a valid measurement (count_t > 0):

\[ z_t = \frac{\text{tone}_t - \mu}{\sigma} \]

  • If \(\sigma = 0\), we return \(z_t = 0\).
  • For days where count_t = 0, we return tone = null and z = null so the frontend treats them as missing.
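
The three cases above (normal day, zero standard deviation, and no-news day) can be sketched like this. The function name `daily_z_scores` and the tuple-based record shape are assumptions for illustration; Python's `None` stands in for null.

```python
def daily_z_scores(days, mu, sigma):
    """days: list of (tone, count) pairs, oldest first.
    Returns a list of (tone, z) pairs, with (None, None) for no-news days."""
    results = []
    for tone, count in days:
        if count <= 0:
            # No measurement: report missing values, not neutral tone.
            results.append((None, None))
        elif sigma == 0:
            # Every measured day has the same tone, so z is defined as 0.
            results.append((tone, 0.0))
        else:
            results.append((tone, (tone - mu) / sigma))
    return results


# Hypothetical window with baseline mu = 2.0 and sigma = 1.0.
scores = daily_z_scores([(1.0, 5), (3.0, 5), (0.0, 0)], mu=2.0, sigma=1.0)
```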

3.5 What “baseline” means

“Baseline” refers to the reference distribution used to standardize tone within the last 30 days (using only measured days):

  • baseline mean: \(\mu\)
  • baseline standard deviation: \(\sigma\)

A value of \(z_t = 0\) means \(\text{tone}_t = \mu\) for the window (volume-weighted when weighting is active).
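
As a worked illustration with made-up numbers (not real GDELT output), suppose the window contains only two measured days: \(\text{tone}_1 = -2\) with \(\text{count}_1 = 1\), and \(\text{tone}_2 = 2\) with \(\text{count}_2 = 7\). Then \(w_1 = \log 2\) and \(w_2 = \log 8 = 3 \log 2\), so the weights are in the ratio 1:3 and the common factor \(\log 2\) cancels:

\[ \mu = \frac{1 \cdot (-2) + 3 \cdot 2}{1 + 3} = 1, \qquad \sigma = \sqrt{\frac{1 \cdot (-2 - 1)^2 + 3 \cdot (2 - 1)^2}{1 + 3}} = \sqrt{3} \approx 1.73 \]

\[ z_1 = \frac{-2 - 1}{\sqrt{3}} \approx -1.73, \qquad z_2 = \frac{2 - 1}{\sqrt{3}} \approx 0.58 \]

The higher-volume day pulls \(\mu\) toward its own tone, so its z-score lands closer to 0 than that of the low-volume day.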

4. What we visualize in the UI

InvestLens renders a range band and a “latest” marker. It does not plot the full daily time series in this widget.

4.1 Bar coordinate system and the z = 0 reference

The UI maps z-scores onto a 0–100 bar using a fixed transformation with clamping:

  • \(z = 0\) is always mapped to the center of the bar (50%).
  • Negative z values map left, positive z values map right.
  • Z-scores are clamped (e.g., to [-3, 3]) so extreme values stay on the bar.

This means the center of the bar is always the “baseline mean” reference point in z-score space (because z is defined as tone minus mean, divided by standard deviation).
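
One transformation that satisfies the properties listed above is a clamped linear map. The page only says the transformation is fixed and clamped, so the linear form, the function name `z_to_bar_percent`, and the default limit of 3 are assumptions.

```python
def z_to_bar_percent(z, z_limit=3.0):
    """Map a z-score to a 0-100 bar position, with z = 0 at the center."""
    # Clamp so extreme values stay on the bar.
    clamped = max(-z_limit, min(z_limit, z))
    # Assumed linear map: -z_limit -> 0, 0 -> 50, +z_limit -> 100.
    return 50.0 + 50.0 * clamped / z_limit


center = z_to_bar_percent(0.0)
far_right = z_to_bar_percent(5.0)  # clamped to the right edge of the bar
```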

4.2 Range band

The shaded band is computed from the observed z-scores in the 30-day window:

  • left edge = minimum observed z in the window (after display clamping and mapping)
  • right edge = maximum observed z in the window (after display clamping and mapping)
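
A sketch of the band computation, skipping null days and reusing an assumed clamped linear mapping to bar percent. The name `range_band`, the linear mapping, and returning `None` for a window with no measured days are illustrative choices, not behavior confirmed by this page.

```python
def range_band(z_scores, z_limit=3.0):
    """z_scores: list of floats or None. Returns (left, right) in bar
    percent, or None if the window has no measured days."""
    observed = [z for z in z_scores if z is not None]
    if not observed:
        return None

    def to_percent(z):
        # Assumed mapping: clamp to [-z_limit, z_limit], then scale to 0-100.
        clamped = max(-z_limit, min(z_limit, z))
        return 50.0 + 50.0 * clamped / z_limit

    return to_percent(min(observed)), to_percent(max(observed))


# Hypothetical window: one no-news day and one value beyond the clamp limit.
band = range_band([None, -1.5, 0.6, 4.0])
```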

4.3 Latest marker

  • The marker is the most recent day’s z-score if it exists.
  • If the most recent day has z = null (no measurement), the frontend should use the most recent earlier day with a valid z-score.
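
The fallback rule can be sketched as a backward scan over the window. The function name `latest_valid_z` is hypothetical, and the list is assumed to be ordered oldest to newest.

```python
def latest_valid_z(z_scores):
    """Return the most recent non-null z-score, or None if every day
    in the window is missing."""
    for z in reversed(z_scores):
        if z is not None:
            return z
    return None


# The most recent day has no measurement, so the previous day is used.
marker = latest_valid_z([0.2, -1.1, None])
```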

5. Interpretation

  • Marker near the center (z close to 0): the latest measured tone is close to the 30-day baseline mean.
  • Marker far left (large negative z): the latest measured tone is unusually negative relative to the measured days in the 30-day window.
  • Marker far right (large positive z): the latest measured tone is unusually positive relative to the measured days in the 30-day window.

6. Limitations

6.1 Query ambiguity

Ticker/name queries can match unrelated mentions depending on how the query is formed.

6.2 Volume interpretation

count_t counts matching articles, not unique events, and may include duplicated or syndicated copies of the same story.

6.3 No-news days

Days with count_t = 0 are treated as missing measurements (tone and z are null), not neutral sentiment.