KAGGLE (leaderboard results)
Winners prize #1 (first place, verified): 500 USD and a 1000 USD travel award + award certificate
AAAGV (Verified by Javier)
The code from the winning team AAAGV, which is publicly available at https://github.com/asutera/kaggle-connectomics, ran successfully on a desktop PC: it used 7 GB of RAM and took 30 hours per dataset in single-core mode on a 3 GHz i7 CPU.
The code is written in Python and uses only standard dependencies. There was an issue with a specific library version, but this has been resolved. Only one script (main.py) needs to be run for the whole computation. I obtained an AUC of 0.9426 on the valid dataset and 0.9416 on the test dataset, the same scores as reported on Kaggle.
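The AUC reported above can be checked without any special tooling: it is the probability that a randomly chosen true connection is scored higher than a randomly chosen non-connection (the Mann-Whitney U formulation). A minimal self-contained sketch of that computation (the labels and scores below are illustrative, not from the actual datasets):

```python
# Rank-based AUC: equals the probability that a random positive
# (true connection) outranks a random negative (non-connection).

def auc(labels, scores):
    """labels are 0/1 adjacency labels; scores are predicted connection strengths."""
    # Sort indices by score and assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic, normalized to [0, 1].
    return (rank_sum - pos * (pos + 1) / 2.0) / (pos * neg)

# Perfectly separated scores give AUC = 1.0:
print(auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))  # → 1.0
```

In practice `sklearn.metrics.roc_auc_score` computes the same quantity; the hand-rolled version above just makes the definition explicit.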
Winners prize #2 (third place, verified): 250 USD and a 750 USD travel award + award certificate
Ildefons (Verified by Javier, Mehreen, and Bisakha)
Ildefons's code, which is publicly available at https://github.com/ildefons/connectomics, consists of 6 separate scripts. The main challenges were installing the required R package gbm and running his script makeFeatures.R, which needed 128 GB of RAM. This R script starts a MATLAB server in the SGE (Sun Grid Engine) background. I had to execute makeFeatures.R separately for the normal-1, normal-2, valid, and test datasets. His code was executed on our standard compute nodes on the cluster; each node has two Intel CPUs, 16 processing cores, and 128 GB of RAM.
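Because makeFeatures.R had to be invoked once per dataset, the verification amounted to four sequential runs. A small driver loop along these lines captures that pattern (the `Rscript makeFeatures.R <dataset>` argument convention is an assumption for illustration, not taken from Ildefons's repository):

```python
# Hypothetical driver that runs the feature script once per dataset.
# The per-dataset argument convention is assumed, not from the actual repo.
import subprocess

DATASETS = ["normal-1", "normal-2", "valid", "test"]

def commands(datasets):
    """Build one Rscript invocation per dataset."""
    return [["Rscript", "makeFeatures.R", d] for d in datasets]

# Each run needs the full 128 GB node, so the runs are launched sequentially:
# for cmd in commands(DATASETS):
#     subprocess.run(cmd, check=True)
print(commands(DATASETS)[0])
```

On an SGE cluster the four runs would more likely be submitted as separate `qsub` jobs; the sequential loop is the simplest equivalent.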
The code passed verification successfully. The AUC of the Kaggle submission we generated is 0.94066, slightly better than his leaderboard score of 0.93900 (a difference of 0.00166).
Winners prize #3 (fourth place, verified): 100 USD and a 400 USD travel award + award certificate
Lukasz (Verified by Javier and Bisakha)
This team's code is available at https://github.com/lr292358/connectomics. Lukasz's code required executing the Python script run.py with different seeds; mergeNormalize.py was then used to average the outputs from the different seeds. All of his code passed verification successfully.
The bottlenecks were installing Theano (a Python module) on the GPU units and gaining access to them. We have 5 cluster nodes with GPU accelerators; each node has one NVIDIA Tesla Kepler (K20) accelerator with 2496 GPU cores. The compute nodes have two Intel CPUs, 16 processing cores, and 128 GB of RAM.
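A common way to merge predictions from runs with different seeds, as mergeNormalize.py does, is to bring each run's scores onto a common scale and then average them element-wise. A minimal sketch of that technique (min-max normalization is one plausible choice; this is not claimed to be Lukasz's exact implementation):

```python
# Sketch of seed-averaging: normalize each run's scores to [0, 1],
# then average across runs. Illustrative only; the real mergeNormalize.py
# may use a different normalization.

def normalize(scores):
    """Rescale one run's scores to [0, 1]; a constant run maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def merge(runs):
    """Element-wise mean of the normalized scores from each seeded run."""
    normed = [normalize(r) for r in runs]
    return [sum(col) / len(col) for col in zip(*normed)]

# Two runs on different scales agree after normalization:
print(merge([[1.0, 3.0, 2.0], [10.0, 30.0, 20.0]]))  # → [0.0, 1.0, 0.5]
```

Normalizing before averaging matters here because runs with different seeds can produce scores on different scales, and AUC depends only on the ranking of the merged scores.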
This work has utilized computing resources at the High Performance Computing Facility of the Center for Health Informatics and Bioinformatics at the NYU Langone Medical Center.