
Data Release

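We release data obtained from ZOZOTOWN, WEAR, and other services so that universities and other public research institutions and researchers can use it for research purposes.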

Open Bandit Dataset

Open Bandit Dataset is a public, real-world logged bandit feedback dataset. It is provided by ZOZO, Inc., the largest Japanese fashion e-commerce company, with a market capitalization of over 5 billion USD (as of May 2020). The company uses multi-armed bandit algorithms to recommend fashion items to users on its large-scale fashion e-commerce platform, ZOZOTOWN.

This dataset is released along with the paper:

Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita.
Large-scale Open Dataset, Pipeline, and Benchmark for Bandit Algorithms https://arxiv.org/abs/2008.07146

When using this dataset, please cite the paper with the following BibTeX entry:

@article{saito2020large,
    title={Large-scale Open Dataset, Pipeline, and Benchmark for Bandit Algorithms},
    author={Saito, Yuta and Aihara, Shunsuke and Matsutani, Megumi and Narita, Yusuke},
    journal={arXiv preprint arXiv:2008.07146},
    year={2020}
 }

Data description

Open Bandit Dataset was constructed from an A/B test of two multi-armed bandit policies on a large-scale fashion e-commerce platform, ZOZOTOWN. It currently consists of a total of 26M rows, each representing a user impression with feature values, the selected item as the action, the true propensity score, and a click indicator as the outcome. This makes it especially suitable for evaluating off-policy evaluation (OPE) methods, which attempt to estimate the counterfactual performance of hypothetical algorithms using data generated by a different algorithm.
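
As a minimal sketch of what OPE means in practice, the standard inverse probability weighting (IPW) estimator reweights each logged click by the ratio of the evaluation policy's action probability to the logged propensity score, then averages. The function name, argument names, and toy numbers below are illustrative assumptions; they are not taken from the dataset or from the paper's pipeline.

```python
def ipw_estimate(clicks, behavior_propensities, evaluation_propensities):
    """Estimate the click rate of an evaluation policy from logged data.

    clicks: observed click indicators (0 or 1), one per logged impression.
    behavior_propensities: probability with which the logging policy chose
        the logged item (the role played by propensity_score in the data).
    evaluation_propensities: probability with which the hypothetical
        evaluation policy would have chosen that same item.
    """
    n = len(clicks)
    if n == 0:
        raise ValueError("at least one logged impression is required")
    total = 0.0
    for click, p_behavior, p_evaluation in zip(
        clicks, behavior_propensities, evaluation_propensities
    ):
        if p_behavior <= 0:
            raise ValueError("behavior propensities must be positive")
        # Importance weight: how much more (or less) likely the evaluation
        # policy is to take the logged action than the logging policy was.
        total += (p_evaluation / p_behavior) * click
    return total / n


# Toy, made-up log of four impressions (not real dataset values).
clicks = [1, 0, 0, 1]
behavior_propensities = [0.5, 0.25, 0.5, 0.25]
evaluation_propensities = [0.25, 0.5, 0.25, 0.5]

estimate = ipw_estimate(clicks, behavior_propensities, evaluation_propensities)
print(estimate)  # 0.625
```

The estimate is only meaningful when the logged propensities are the true action probabilities of the logging policy, which is why the dataset records them for every row.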

Fields

Here is a detailed description of the fields (comma-separated in the CSV files).

{behavior_policy}/{campaign}.csv (behavior_policy in (bts, random), campaign in (all, men, women))

  • timestamp: timestamps of impressions.
  • item_id: index of items as arms (index ranges from 0-80 in the "All" campaign, 0-33 in the "Men" campaign, and 0-46 in the "Women" campaign).
  • position: the position at which an item is recommended (1, 2, and 3 correspond to the left, center, and right positions of the ZOZOTOWN recommendation interface, respectively).
  • click: target variable that indicates if an item was clicked (1) or not (0).
  • propensity_score: the probability of an item being recommended at each position.
  • user feature 0-4: user-related feature values.
  • user-item affinity 0-: user-item affinity scores induced by the number of past clicks observed between each user-item pair.
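
The fields above can be read with any CSV parser. The following is a minimal sketch that uses only the Python standard library to compute the click rate per position. It assumes the CSV header uses the field names exactly as listed (timestamp, item_id, position, click, propensity_score); the sample rows are made up for illustration and are not real dataset values.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a file such as random/all.csv.
# The header names are an assumption based on the field list above.
SAMPLE_CSV = """timestamp,item_id,position,click,propensity_score
2019-11-24 00:00:01,12,1,0,0.0125
2019-11-24 00:00:01,45,2,1,0.0125
2019-11-24 00:00:02,7,3,0,0.0125
2019-11-24 00:00:03,12,1,1,0.0125
"""


def click_rate_by_position(csv_file):
    """Return {position: click rate} for an open CSV file object."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for row in csv.DictReader(csv_file):
        position = int(row["position"])
        impressions[position] += 1
        clicks[position] += int(row["click"])
    return {
        position: clicks[position] / impressions[position]
        for position in sorted(impressions)
    }


rates = click_rate_by_position(io.StringIO(SAMPLE_CSV))
print(rates)  # {1: 0.5, 2: 1.0, 3: 0.0}
```

To run this against a downloaded file, replace the `io.StringIO(SAMPLE_CSV)` argument with a file opened via `open(path, newline="")`, and adjust the column names if the actual header differs.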

item_context.csv

  • item_id: index of items as arms (index ranges from 0-80 in the "All" campaign, 0-33 in the "Men" campaign, and 0-46 in the "Women" campaign).
  • item feature 0-3: item-related feature values.

Please visit the examples to learn how to use the data.

Google Group

The project is ongoing. The project team plans to expand the data and release new versions of the dataset in the near future. If you are interested, you can follow updates in our Google Group: https://groups.google.com/g/open-bandit-project

Contact

For any questions, feel free to contact:

The authors of the paper: saito@hanjuku-kaso.com
ZOZO Research: zozo-research@zozo.com

Download