2020.12.25

Factorization Analysis of Transportation Data

Research and Development by

Itsuki Noda, AIST

Corresponding Research Area

Simulation and designing countermeasures against possible COVID-19 resurgence: predicting spreading of infection, estimating and verifying the effectiveness of countermeasures, and predicting deman


Summary

  • Transportation data for Jan.-Oct. 2020 can be represented as a mixture of 12 basic patterns.
    • Each basic pattern is a combination of the following spatial and temporal patterns.
      • Spatial: “among neighbor prefectures” and “between metropolitan and nation-wide areas”.
      • Temporal: “weekday” and “holiday”.
  • Correlation analysis between the weekly average of the effective reproduction number (Rt) and the 12 basic patterns from mid-March onward shows:
    • Rt of Hokkaido has a weak correlation with 3 basic patterns.
      • Simulation analysis is needed to confirm dependency and controllability.

Procedure of Analysis

  • Consider travel data among prefectures as a 3D tensor whose axes are origin, destination, and date.
    • Data source: Blogwatcher, Inc.
    • Intra-prefecture travel is not counted, because it includes staying home.
    • The travel tensor is normalized as a probability tensor whose total is 1.
  • Apply non-negative tensor factorization to the travel tensor to obtain a mixture of basic patterns.
    • Each basic pattern is a direct product of an inter-prefecture matrix (spatial factor) and a day vector (temporal pattern).

  • Calculate correlation between changes of Rt and the day vector of each basic pattern.
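
The normalization and factorization steps above can be sketched as follows. This is only an illustration, not the implementation used in the analysis: the source does not state which factorization algorithm was applied. Because each basic pattern is an inter-prefecture matrix times a day vector, the sketch unfolds the tensor into an (origin-destination pair) x (day) matrix and applies non-negative matrix factorization with multiplicative updates for the KL divergence. The function names `normalize_travel_tensor` and `factorize`, the iteration count, and the random initialization are all assumptions.

```python
import numpy as np

def normalize_travel_tensor(counts):
    """Zero intra-prefecture trips and scale the tensor so that it sums to 1."""
    x = np.asarray(counts, dtype=float).copy()
    idx = np.arange(x.shape[0])
    x[idx, idx, :] = 0.0                 # drop entries where origin == destination
    return x / x.sum()

def factorize(x, n_patterns, n_iter=200, seed=0, eps=1e-12):
    """Approximate x[o, d, t] by sum_k spatial[k, o, d] * temporal[k, t]."""
    n_pref, _, n_days = x.shape
    v = x.reshape(n_pref * n_pref, n_days)   # rows: (origin, destination) pairs
    rng = np.random.default_rng(seed)
    w = rng.random((v.shape[0], n_patterns)) + eps   # spatial factors (unfolded)
    h = rng.random((n_patterns, n_days)) + eps       # temporal factors
    for _ in range(n_iter):
        # Multiplicative updates that decrease the KL divergence between v and w @ h.
        wh = w @ h + eps
        w *= ((v / wh) @ h.T) / (h.sum(axis=1) + eps)
        wh = w @ h + eps
        h *= (w.T @ (v / wh)) / (w.sum(axis=0)[:, None] + eps)
    spatial = w.T.reshape(n_patterns, n_pref, n_pref)
    return spatial, h
```

With 47 prefectures, `spatial[k]` would be a 47 x 47 matrix and `h[k]` the day vector of the k-th basic pattern; both stay non-negative because the updates only multiply by non-negative ratios.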

BIC Analysis

  • Calculate BIC to determine a suitable number of basic patterns.
    • BIC is minimized at 12 basic patterns.
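
As a rough illustration of this selection step, the sketch below scores each candidate number of patterns with the standard BIC formula, k ln(n) - 2 ln(L), and picks the minimum. How the log-likelihood, the number of free parameters, and the number of observations were defined for the travel tensor is not given in the source, so they are taken here as inputs; the function names `bic` and `select_n_patterns` are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def select_n_patterns(candidates):
    """Pick the number of patterns with the smallest BIC.

    candidates maps a number of patterns to a tuple
    (log_likelihood, n_params, n_obs) for the fitted model.
    """
    scores = {k: bic(*fit) for k, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```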

Acquired Temporal Patterns of the Basic Patterns (1)

  • Plots of the changes in probability of the temporal patterns of the 12 basic patterns (0th...11th).
    • The 0th pattern (among neighbor prefectures, weekday) accounts for a major part of the whole travel pattern.

Acquired Temporal Patterns of the Basic Patterns (2)

  • Details of the 1st ... 11th basic patterns.
    • The 4th and 7th patterns are relatively large.
      • The 4th pattern: among neighbor prefectures, holiday.
      • The 7th pattern: between metropolis and nation-wide.

Each Temporal Pattern (0th ... 5th)

  • types:
    • weekday: has narrow valleys at weekends.
    • holiday: has peaks at weekends.
    • pre-corona: Jan. and Feb. are larger than summer and autumn.
    • post-corona: summer and autumn are larger than Jan. and Feb.
  • horizontal axis: day
    • Jan. 1st -- Oct. 30th, 2020.
  • vertical axis: prob.
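
The type labels above can be checked mechanically. The sketch below is one possible way to do that, not the procedure used for the slides: it compares weekend and weekday means, and Jan.-Feb. and later-month means, for a single day vector. The function name `describe_temporal_pattern`, the treatment of June onward as "summer and autumn", and the omission of public holidays are all assumptions.

```python
from datetime import date, timedelta
import numpy as np

def describe_temporal_pattern(day_vector, start=date(2020, 1, 1)):
    """Label a day vector as weekday/holiday type and pre-/post-corona type."""
    v = np.asarray(day_vector, dtype=float)
    days = [start + timedelta(days=i) for i in range(len(v))]
    is_weekend = np.array([d.weekday() >= 5 for d in days])   # Saturday, Sunday
    is_early = np.array([d.month <= 2 for d in days])          # Jan. and Feb.
    is_late = np.array([d.month >= 6 for d in days])           # assumed "summer and autumn"
    week_type = "holiday" if v[is_weekend].mean() > v[~is_weekend].mean() else "weekday"
    season_type = "pre-corona" if v[is_early].mean() > v[is_late].mean() else "post-corona"
    return week_type, season_type
```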

Each Temporal Pattern (6th ... 11th)

  • types:
    • weekday: has narrow valleys at weekends.
    • holiday: has peaks at weekends.
    • pre-corona: Jan. and Feb. are larger than summer and autumn.
    • post-corona: summer and autumn are larger than Jan. and Feb.
  • horizontal axis: day
    • Jan. 1st -- Oct. 30th, 2020.
  • vertical axis: prob.

Each Spatial Pattern (0th-5th)

  • neighbor type: large at diagonal region.
  • metropolis nation-wide: horizontal and vertical bands.
  • nation-wide: spread widely.


  • horizontal: from pref.
  • vertical: to pref.
  • color: prob.  (scaled for each pattern)

Each Spatial Pattern (6th-11th)

  • neighbor type: large at diagonal region.
  • metropolis nation-wide: horizontal and vertical bands.
  • nation-wide: spread widely.


  • horizontal: from pref.
  • vertical: to pref.
  • color: prob.  (scaled for each pattern)

Prefectures

  • 0: Hokkaido (representative of Hokkaido Area)
  • 1: Aomori
  • 2: Iwate
  • 3: Miyagi (representative of Tohoku Area)
  • 4: Akita
  • 5: Yamagata
  • 6: Fukushima
  • 7: Ibaraki
  • 8: Tochigi
  • 9: Gunma
  • 10: Saitama
  • 11: Chiba
  • 12: Tokyo (representative of Kanto Area)
  • 13: Kanagawa
  • 14: Yamanashi
  • 15: Nagano
  • 16: Niigata
  • 17: Toyama
  • 18: Ishikawa (representative of Hokuriku Area)
  • 19: Fukui
  • 20: Gifu
  • 21: Shizuoka
  • 22: Aichi (representative of Tokai Area)
  • 23: Mie
  • 24: Shiga
  • 25: Kyoto
  • 26: Osaka (representative of Kinki Area)
  • 27: Hyogo
  • 28: Nara
  • 29: Wakayama
  • 30: Tottori
  • 31: Shimane
  • 32: Okayama
  • 33: Hiroshima (representative of Chugoku Area)
  • 34: Tokushima
  • 35: Kagawa
  • 36: Ehime (representative of Shikoku Area)
  • 37: Kochi
  • 38: Yamaguchi
  • 39: Fukuoka (representative of North Kyushu Area)
  • 40: Saga
  • 41: Nagasaki
  • 42: Kumamoto
  • 43: Oita
  • 44: Miyazaki
  • 45: Kagoshima (representative of South Kyushu Area)
  • 46: Okinawa (representative of Okinawa Area)

Correlation Matrix between Basic Patterns and Rt.

  • Period: mid-March to October.
    • Use the weekly average of Rt with an 8-day delay.
  • Rt of Hokkaido has weak correlations with the f03, f06, and f07 patterns.
    • Each of these patterns is between metropolis and nation-wide, on weekdays.
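
A minimal sketch of such a delayed correlation is shown below. It assumes that travel on day t is paired with Rt on day t + 8 and that both series are averaged over consecutive 7-day blocks before computing the Pearson correlation; the source only states that a weekly average of Rt with an 8-day delay was used, so the exact alignment is an assumption, as are the function names `weekly_mean` and `delayed_weekly_correlation`.

```python
import numpy as np

def weekly_mean(x):
    """Average a daily series over consecutive 7-day blocks (drops the remainder)."""
    x = np.asarray(x, dtype=float)
    n_weeks = len(x) // 7
    return x[: n_weeks * 7].reshape(n_weeks, 7).mean(axis=1)

def delayed_weekly_correlation(day_vector, rt, delay_days=8):
    """Pearson correlation between a pattern's day vector and Rt delay_days later."""
    f = np.asarray(day_vector, dtype=float)
    r = np.asarray(rt, dtype=float)
    if delay_days > 0:
        # Pair travel on day t with Rt on day t + delay_days.
        f, r = f[:-delay_days], r[delay_days:]
    return np.corrcoef(weekly_mean(f), weekly_mean(r))[0, 1]
```

Calling the function with `delay_days=0` gives the on-time (no delay) variant for comparison.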

Scatter Plot

Scatter of f06 (horizontal) and Rt (vertical)

Scatter of f07 (horizontal) and Rt (vertical)

  • Scatter plots of Rt of Hokkaido against f06 and f07.
    • Weak correlations.
    • Colors indicate the week number.
      • The correlation with f06 becomes weaker in later weeks.

(ref.) Correlation Matrix between Basic Patterns and Rt.

  • Using on-time Rt (no delay):
    • Correlations are relatively low.