Data

Data page description

Dataset

Arcstar provides its data scientist community with free, curated, high-quality, obfuscated data.

There are 6 datasets in the Tournament.

X_train

/data/X_train.csv

y_train

/data/y_train.csv

X_test

/data/X_test.csv
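
As a minimal sketch of working with the three files once they are downloaded, the snippet below builds their local paths and streams rows one at a time, which avoids holding a large file in memory. The local folder name, the helper names `dataset_paths` and `iter_rows`, and the presence of a header row in each CSV are assumptions made for illustration.

```python
import csv
from pathlib import Path

# File names taken from the list above; the local root folder is an assumption.
FILE_NAMES = {
    "X_train": "X_train.csv",
    "y_train": "y_train.csv",
    "X_test": "X_test.csv",
}


def dataset_paths(root):
    """Return the expected path of each file under a local root folder."""
    return {name: Path(root) / file_name for name, file_name in FILE_NAMES.items()}


def iter_rows(path):
    """Yield one dict per CSV row (assumes a header row), without loading the whole file."""
    with open(path, newline="") as handle:
        yield from csv.DictReader(handle)


paths = dataset_paths("data")
```

For example, `next(iter_rows(paths["X_train"]))` would return the first row as a dictionary keyed by column name, provided the file exists locally.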

ID

Each id in X_train and X_test corresponds to a stock at a specific point in time (its Moons value).

Moons

The interval between consecutive Moons depends on the dataset:

gordon-geeko: 30-day interval between consecutive moons
dolly: 90-day interval
e-kinetic: 7-day interval
c-mechanics: 7-day interval
b-volatility: 7-day interval
3b1-signal: 7-day interval
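
The intervals listed above can be captured in a small lookup table. The sketch below assumes that moons are numbered with consecutive integers starting at 0 and are evenly spaced; the helper names and the start date used in the example are hypothetical, since the data is obfuscated and real dates are not given here.

```python
from datetime import date, timedelta

# Days between consecutive moons, per dataset, as listed above.
MOON_INTERVAL_DAYS = {
    "gordon-geeko": 30,
    "dolly": 90,
    "e-kinetic": 7,
    "c-mechanics": 7,
    "b-volatility": 7,
    "3b1-signal": 7,
}


def moon_offset_days(dataset, moon_index):
    """Days elapsed between moon 0 and the given moon (assumes even spacing)."""
    return MOON_INTERVAL_DAYS[dataset] * moon_index


def moon_date(dataset, moon_index, first_moon):
    """Calendar date of a moon, given a hypothetical date for moon 0."""
    return first_moon + timedelta(days=moon_offset_days(dataset, moon_index))
```

For instance, `moon_offset_days("dolly", 4)` gives 360, meaning the fifth moon of dolly falls 360 days after the first one under these assumptions.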

Features

The features describe specific attributes of a stock at a point in time.

Targets

The y_train file contains three targets, target_r, target_g and target_b, which correspond to the idiosyncratic return of the stock over three time horizons: 30, 60 and 90 days respectively.
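
To make the target-to-horizon mapping concrete, here is a small sketch that selects the target column for a given horizon. It assumes y_train.csv has a header row using the three target names as column names; the helper names and the sample values are made up for illustration and are not taken from the real data.

```python
import csv
import io

# Horizon in days for each target, as described above.
TARGET_HORIZON_DAYS = {"target_r": 30, "target_g": 60, "target_b": 90}


def target_for_horizon(days):
    """Return the target column name matching a horizon in days."""
    for name, horizon in TARGET_HORIZON_DAYS.items():
        if horizon == days:
            return name
    raise ValueError(f"no target with a {days}-day horizon")


def read_target(handle, horizon_days):
    """Read one target column as floats from an open CSV file or buffer."""
    column = target_for_horizon(horizon_days)
    return [float(row[column]) for row in csv.DictReader(handle)]


# Made-up values in the assumed column layout, used only to exercise the helper.
sample = io.StringIO(
    "target_r,target_g,target_b\n"
    "0.01,0.02,0.03\n"
    "-0.01,0.00,0.04\n"
)
values_60d = read_target(sample, 60)
```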

Split

The overall dataset is split in two: train and test. The test set starts one moon after the last moon of X_train.
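
Because the test set follows the training set in time, a local validation split can mirror that ordering by holding out the most recent moons of the training data. The following is only a sketch: it assumes each row carries a sortable moon value under a column named `Moons` (the column name is an assumption), and `split_by_moon` is a hypothetical helper, not part of the Tournament tooling.

```python
def split_by_moon(rows, n_holdout_moons=1, moon_key="Moons"):
    """Split rows chronologically, holding out the last n distinct moons."""
    moons = sorted({row[moon_key] for row in rows})
    if not 1 <= n_holdout_moons < len(moons):
        raise ValueError("n_holdout_moons must be at least 1 and below the number of moons")
    cutoff = moons[-n_holdout_moons]  # first moon that goes to validation
    train = [row for row in rows if row[moon_key] < cutoff]
    valid = [row for row in rows if row[moon_key] >= cutoff]
    return train, valid


# Placeholder rows: six ids spread over four moons.
rows = [{"id": i, "Moons": m} for i, m in enumerate([0, 0, 1, 1, 2, 3])]
train_rows, valid_rows = split_by_moon(rows, n_holdout_moons=1)
```

If the rows come from `csv.DictReader`, the moon values are strings and would need converting to integers before sorting.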

Files can be large (200+ MB), so make sure you have enough disk space before downloading.

Last updated 2 years ago
