Two needles in a haystack
     

bullero


Total Posts: 57
Joined: Feb 2018
 
Posted: 2019-08-15 22:07
Looking for someone who is experienced in dealing with extremely imbalanced data sets and classification.

nikol


Total Posts: 838
Joined: Jun 2005
 
Posted: 2019-08-17 12:49
Is it about a very low S/B ratio?

Maybe try these?
https://www.freelancer.com.ru/jobs/machine-learning/

EspressoLover


Total Posts: 384
Joined: Jan 2015
 
Posted: 2019-08-18 03:01
Sounds pretty similar to classic anomaly detection. In which case your best bet for someone with practical experience is fraud detection at a transaction processor.

Good questions outrank easy answers. -Paul Samuelson

sharpe_machine


Total Posts: 30
Joined: Feb 2018
 
Posted: 2019-08-18 13:42
I recommend browsing Kaggle kernels and winners' solutions. You can pick up all the major tricks from them. Possible keywords: CTR prediction, fraud detection, spam detection, etc.

Examples:
https://www.kaggle.com/janiobachmann/credit-fraud-dealing-with-imbalanced-datasets
https://www.kaggle.com/c/avazu-ctr-prediction/overview
https://www.kaggle.com/c/criteo-display-ad-challenge/overview

TonyC
Nuclear Energy Trader

Total Posts: 1316
Joined: May 2004
 
Posted: 2019-08-18 15:56
I found these to be good intros to the subject of anomaly detection, but I have no insight as to where to find people who are very experienced at it.

Ted Dunning, "A New Look at Anomaly Detection"
https://mapr.com/practical-machine-learning-new-look-anomaly-detection/

Also Rob Hyndman's "oddstream" R package
https://robjhyndman.com/publications/oddstream/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ProfessorRobJHyndman+%28Professor+Rob+J+Hyndman%29

flaneur/boulevardier/remittance man/energy trader

bullero


Total Posts: 57
Joined: Feb 2018
 
Posted: 2019-08-19 06:01
A big thanks to everyone.

gmetric_Flow


Total Posts: 27
Joined: Oct 2016
 
Posted: 2019-08-20 14:38
Maybe you'll find this paper from Uber useful or, at the very least, interesting?
https://pdfs.semanticscholar.org/a9b0/77367a8ca4cfa8425db31dd339673ddf1579.pdf

bullero


Total Posts: 57
Joined: Feb 2018
 
Posted: 2019-08-20 22:08
@gmetric_Flow thanks!

So you run two LSTMs in parallel. The first one is hourglass-shaped such that a large amount of information flows through a small number of nodes. So the weights there are ~ eigenvectors of feature correlations?
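
For the purely linear case the eigenvector intuition can be checked directly. What follows is only a sketch on synthetic data, not the Uber model: an LSTM encoder is nonlinear, so the analogy is loose at best. For a linear autoencoder trained with squared error, the best rank-k reconstruction is given by the top-k right singular vectors of the standardized data (Eckart-Young), and those span the same subspace as the top-k eigenvectors of the feature correlation matrix. All names and sizes below are made up for illustration.

```python
import numpy as np

# Synthetic data: 6 observed features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
n, d, k = 2000, 6, 2
latent = rng.normal(size=(n, k))
mixing = rng.normal(size=(k, d))
X = latent @ mixing + 0.1 * rng.normal(size=(n, d))

# Standardize so that Z.T @ Z / n is exactly the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Top-k eigenvectors of the feature correlation matrix.
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # shape (d, k)

# Optimal linear "hourglass": encode with V, decode with V.T.
_, _, vt = np.linalg.svd(Z, full_matrices=False)
V = vt[:k].T                                     # shape (d, k)
code = Z @ V                                     # bottleneck activations
Z_hat = code @ V.T                               # reconstruction

# Individual vectors may differ by sign or rotation, so compare the
# projection matrices onto the two subspaces instead.
subspace_gap = np.abs(top @ top.T - V @ V.T).max()
print("max difference between projectors:", subspace_gap)
```

The bottleneck weights are only pinned down up to an invertible mixing of the k directions, so "same subspace as the leading eigenvectors" is the most that holds even in the linear case.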

nikol


Total Posts: 838
Joined: Jun 2005
 
Posted: 2019-08-21 19:30
In the same direction
https://towardsdatascience.com/extreme-rare-event-classification-using-autoencoders-in-keras-a565b386f098

ADDED:
Search for ETI
https://towardsdatascience.com/ai-powered-search-for-extra-terrestrial-intelligence-signal-classification-with-deep-learning-6c09de8fd57c
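
The idea behind the autoencoder link above, as I understand it, is to fit a reconstruction model on the majority class only and flag points that reconstruct badly. Below is a minimal sketch of that scoring step. It swaps the Keras autoencoder for a plain PCA projection (a linear stand-in), and the data, the 2-dimensional bottleneck, and the 99% threshold are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
d, k = 8, 2
mixing = rng.normal(size=(k, d))

def make_normal(n):
    # "Normal" points lie close to a 2-D subspace of the 8-D feature space.
    return rng.normal(size=(n, k)) @ mixing + 0.05 * rng.normal(size=(n, d))

X_train = make_normal(5000)          # majority class only
X_test_normal = make_normal(5000)
X_test_rare = rng.normal(size=(20, d))  # rare events: off the subspace

# Fit the linear "autoencoder" (PCA) on the majority class.
mu = X_train.mean(axis=0)
_, _, vt = np.linalg.svd(X_train - mu, full_matrices=False)
V = vt[:k].T

def reconstruction_error(X):
    # Squared distance between each point and its rank-k reconstruction.
    centered = X - mu
    recon = centered @ V @ V.T
    return ((centered - recon) ** 2).sum(axis=1)

# Flag anything worse than 99% of the training points.
threshold = np.quantile(reconstruction_error(X_train), 0.99)
recall = (reconstruction_error(X_test_rare) > threshold).mean()
false_positive_rate = (reconstruction_error(X_test_normal) > threshold).mean()
print("recall on rare events:", recall)
print("false positive rate:", false_positive_rate)
```

The toy rare events here are easy to separate by construction; on real data the threshold would have to be tuned against the cost of false alarms, since even a 1% false positive rate can swamp a handful of true events.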

Rashomon


Total Posts: 208
Joined: Mar 2011
 
Posted: 2019-08-21 20:13
Watch out for Hyndman. Culurciello is quality.

bullero


Total Posts: 57
Joined: Feb 2018
 
Posted: 2019-08-22 12:32
Thanks!