in Machine Learning by (19k points)

I have the following data

         feat_1    feat_2 ... feat_n   label
gene_1   100.33     10.2  ... 90.23    great
gene_2   13.32      87.9  ... 77.18    soso
....
gene_m   213.32     63.2  ... 12.23    quitegood

M is large, around 30K rows, while N is much smaller, around 10 columns. My question is: what is an appropriate deep learning architecture to learn from and test on data like the above?

At the end of the day, the user will provide a vector of genes with expression values:

gene_1   989.00
gene_2   77.10
...
gene_N   100.10

And the system should predict which label applies to each gene, e.g. great or soso, etc.

By structure I mean one of these:

  • Convolutional Neural Network (CNN)
  • Autoencoder
  • Deep Belief Network (DBN)
  • Restricted Boltzmann Machine (RBM)
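
For illustration, a table in this shape might be loaded along these lines before any model is built (just a sketch; the file name gene_expression.csv and the 'label' column name are placeholders for however the table is actually stored):

import pandas as pd

# Hypothetical file; assumes the table above is stored as CSV with gene IDs
# as the index, the ~10 feature columns, and the class in a 'label' column.
df = pd.read_csv('gene_expression.csv', index_col=0)

X = df.drop(columns=['label']).values   # ~30K x ~10 matrix of expression features
y = df['label'].values                  # e.g. 'great', 'soso', 'quitegood', ...

print(X.shape, y.shape)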

1 Answer

by (33.1k points)

You might want to read a bit more about how these architectures work. Since each sample here is just a short vector of numeric features with no spatial or sequential structure, a plain fully connected feed-forward network is a reasonable starting point.

For example:

import numpy as np
from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout

# Create some random data
np.random.seed(42)
X = np.random.random((10, 50))

# Similar labels
labels = ['good', 'bad', 'soso', 'amazeballs', 'good']
labels += labels
labels = np.array(labels)
np.random.shuffle(labels)

# Change the labels to the required one-hot format
numericalLabels = preprocessing.LabelEncoder().fit_transform(labels)
numericalLabels = numericalLabels.reshape(-1, 1)
# note: newer scikit-learn versions use sparse_output=False instead of sparse=False
y = preprocessing.OneHotEncoder(sparse=False).fit_transform(numericalLabels)

# Simple Keras model builder: a fully connected network with nLayers hidden
# layers of nNeurons each, dropout in between, and a softmax output layer
def buildModel(nFeatures, nClasses, nLayers=3, nNeurons=10, dropout=0.2):
    model = Sequential()
    model.add(Dense(nNeurons, input_dim=nFeatures))
    model.add(Activation('sigmoid'))
    model.add(Dropout(dropout))
    for i in range(nLayers - 1):
        model.add(Dense(nNeurons))
        model.add(Activation('sigmoid'))
        model.add(Dropout(dropout))
    model.add(Dense(nClasses))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='sgd')
    return model

# Small grid search over depth and width, reporting the best validation loss
for nLayers in range(2, 4):
    for nNeurons in range(5, 8):
        model = buildModel(X.shape[1], y.shape[1], nLayers, nNeurons)
        modelHist = model.fit(X, y, batch_size=32, epochs=10,
                              validation_split=0.3, shuffle=True, verbose=0)
        minLoss = min(modelHist.history['val_loss'])
        epochNum = modelHist.history['val_loss'].index(minLoss)
        print('{0} layers, {1} neurons best validation at '
              'epoch {2} loss = {3:.2f}'.format(nLayers, nNeurons, epochNum, minLoss))

Output:

2 layers, 5 neurons best validation at epoch 0 loss = 1.18
2 layers, 6 neurons best validation at epoch 0 loss = 1.21
2 layers, 7 neurons best validation at epoch 8 loss = 1.49
3 layers, 5 neurons best validation at epoch 9 loss = 1.83
3 layers, 6 neurons best validation at epoch 9 loss = 1.91
3 layers, 7 neurons best validation at epoch 9 loss = 1.65
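
Once you have picked a configuration and retrained on the full dataset, the same model can label a new expression vector from the user. A rough continuation of the snippet above (it assumes the LabelEncoder was kept as a fitted object, e.g. labelEncoder = preprocessing.LabelEncoder() followed by labelEncoder.fit_transform(labels), rather than created inline):

# Sketch: labelling one new sample with the trained network.
newSample = np.random.random((1, X.shape[1]))    # stand-in for the user's 1 x n_features expression vector
probabilities = model.predict(newSample)         # softmax output, shape (1, nClasses)
predictedLabel = labelEncoder.inverse_transform(probabilities.argmax(axis=1))
print(predictedLabel)                            # e.g. ['soso']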

Hope this answer helps you! For more insights, study the Machine Learning Online Course, and go through the Deep Learning Tutorial for more details on this.
