Participation Eligibility

Participants must represent a US-based company with more than 20 employees*.
The Challenge is open to individuals or teams of up to 6 people. Teams will designate a Team Leader, who will serve as the primary point of contact.
Participants must be able to commit 10-15 hours to the Challenge over 1-2 weeks in February and March 2026.
Final eligibility shall be determined after application evaluation, at the discretion of the Organizer.

*Practitioners at companies based outside the US may also register; we will consider their participation in the Challenge on a case-by-case basis.

Use Case & Data Provision Requirements

Participants can select any binary supervised machine learning (ML) use case on tabular data whose model has been in production for at least 30 days.
Data tables must reside on one of the following platforms: Databricks, Snowflake, or BigQuery.
Participants must provide access to normalized data related to the use case or a functional equivalent. All data should adhere to general relational database model requirements.
A data dictionary should be provided if available.
Participants are required to provide multiple linkable data tables (at least 4 and up to 20 tables).
Views are acceptable if used to remove Personally Identifiable Information (PII) or for data cleaning.
Acceptable table types include Dimension tables, Slowly Changing Dimension (SCD) tables, Event tables, Time Series tables, and Snapshot tables.
Tables should only include structured data—no data containing images, PDFs, or nested JSON files.
Data tables containing time-based records (SCD, Event, Time Series, Snapshot) must include a time column with a consistent format and a primary key column.
Dimension tables must contain a unique primary key column.
Teams must be available to answer questions on their data to help FeatureByte develop a basic understanding of the data, schema, and relationships.
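
The table requirements above can be checked before submitting a dataset. The following is a minimal Python sketch, not an official FeatureByte tool: the function names, the in-memory row format (a list of dictionaries), and the default timestamp format are all assumptions made for illustration.

```python
from datetime import datetime


def check_table_count(table_names, minimum=4, maximum=20):
    """Return True if the number of distinct tables is within the stated 4-20 range."""
    return minimum <= len(set(table_names)) <= maximum


def check_unique_key(rows, key):
    """Return True if every row has a non-null, unique value in `key`.

    Intended for Dimension tables, which must have a unique primary key.
    """
    seen = set()
    for row in rows:
        value = row.get(key)
        if value is None or value in seen:
            return False
        seen.add(value)
    return True


def check_time_column(rows, column, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if every value in `column` parses with the single format `fmt`.

    The default format is an assumption; use whatever format your tables follow.
    """
    for row in rows:
        try:
            datetime.strptime(row[column], fmt)
        except (KeyError, TypeError, ValueError):
            return False
    return True
```

In practice these checks would run against samples pulled from the data platform; the sketch only illustrates what "unique primary key" and "consistent time format" mean operationally.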

Data Access Requirements

FeatureByte must receive read access to all the tables in the dataset.
Write access must be provided to a dedicated schema in the data platform for storing FeatureByte-generated files and tables.
A service account must be created in the data platform for FeatureByte to read and write data.
An appropriately sized cluster must be provided to handle the data volume on which computations will be performed. Minimum requirements are:
- Snowflake Warehouse: X-Small
- Databricks SQL Warehouse: Small
- BigQuery: NA

Evaluation Criteria

Model evaluation is based on predicted probabilities for the test dataset.
Model performance improvement is considered meaningful if the AUC of the Participant’s model is at least 0.01 higher than that of the model created using FeatureByte, that is, AUC_Participant - AUC_FeatureByte > 0.01.
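
As an illustration of the criterion above, here is a minimal Python sketch that computes AUC from target values and predicted probabilities and applies the 0.01 margin. It is not the Organizer's evaluation code. The function names are hypothetical, and the comparison follows the strict inequality in the stated formula.

```python
def roc_auc(targets, probabilities):
    """AUC as the chance that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is quadratic in the data size,
    so it suits small checks rather than large test sets.
    """
    positives = [p for t, p in zip(targets, probabilities) if t == 1]
    negatives = [p for t, p in zip(targets, probabilities) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative target")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def is_meaningful_improvement(auc_participant, auc_featurebyte, margin=0.01):
    """Apply the stated rule: AUC_Participant - AUC_FeatureByte > margin."""
    return (auc_participant - auc_featurebyte) > margin
```

For example, a Participant AUC of 0.85 against a FeatureByte AUC of 0.83 gives a difference of 0.02, which clears the margin; 0.835 against 0.83 does not.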

Evaluation Rules

The Participant's production model must have been in production for at least 30 days.
Participants must provide a training observation table containing the target, ideally the same as or very similar to the one used to train their model.
Participants must provide the partition schema with targets that was used to train and test their model (including test and holdout tables, if a holdout is available) and their model's AUC for the test/holdout data.
At the evaluation stage, participants must submit predicted probabilities generated by their model, as well as target values for the test dataset.
To claim the prize, participants must provide a list of features used in their model and indicate feature importance.
Final determination of prize eligibility will be made by a panel of independent judges.
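
Before submitting predicted probabilities and target values for the test dataset, a basic sanity check can catch common problems. The sketch below is an assumption about what such a check might look like; the rules do not prescribe a submission format, and `validate_submission` is a hypothetical helper.

```python
def validate_submission(targets, probabilities):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    if len(targets) != len(probabilities):
        problems.append("targets and probabilities differ in length")
    if any(t not in (0, 1) for t in targets):
        problems.append("targets must be 0 or 1")
    # A NaN fails this comparison, so it is reported as out of range too.
    if any(not (0.0 <= p <= 1.0) for p in probabilities):
        problems.append("probabilities must lie in [0, 1]")
    if len(set(targets)) < 2:
        problems.append("both classes must be present to compute AUC")
    return problems
```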

Deadline to register: January 30, 2026

Are You Ready?

Think your production models can beat the FeatureByte platform? Now’s your chance to prove it — on real data, in real-world conditions, with real rewards.
