`%tensorflow_version` only switches the major version: `1.x` or `2.x`.
You set: `2.x # Colab only.`. This will be interpreted as: `2.x`.
TensorFlow 2.x selected.
2.0.0-beta1
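
The setup and import cells that produced the output above are not shown; the following is a minimal reconstruction, assuming Colab and the TensorFlow 2 Keras API. The data-loading step, which defines df and Y, is also assumed rather than taken from the original.

# Colab only: select TensorFlow 2.x (this produced the messages above).
%tensorflow_version 2.x  # Colab only.

import tensorflow as tf
print(tf.__version__)

# Imports assumed by the cells below.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Input, Embedding, LSTM, GlobalMaxPooling1D, Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split

# Assumed but not shown: df is a pandas DataFrame with a text column 'data',
# and Y is the matching array of 0/1 labels.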
# split up the data
df_train, df_test, Ytrain, Ytest = train_test_split(df['data'], Y, test_size=0.33)
# Convert sentences to sequences
MAX_VOCAB_SIZE = 20000
tokenizer = Tokenizer(num_words=MAX_VOCAB_SIZE)
tokenizer.fit_on_texts(df_train)
sequences_train = tokenizer.texts_to_sequences(df_train)
sequences_test = tokenizer.texts_to_sequences(df_test)
# get word -> integer mapping
word2idx = tokenizer.word_index
V = len(word2idx)
print('Found %s unique tokens.' % V)
Found 7309 unique tokens.
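
As a toy illustration (not part of the original run) of what fit_on_texts and texts_to_sequences do: the Tokenizer assigns integer indices starting from 1, ordered by word frequency, and index 0 is never assigned to a word.

tok = Tokenizer()
tok.fit_on_texts(['the cat sat', 'the dog sat on the mat'])
print(tok.word_index)
# {'the': 1, 'sat': 2, 'cat': 3, 'dog': 4, 'on': 5, 'mat': 6}
print(tok.texts_to_sequences(['the dog sat']))
# [[1, 4, 2]]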
# pad sequences so that we get an N x T matrix
data_train = pad_sequences(sequences_train)
print('Shape of data train tensor:', data_train.shape)

# get sequence length
T = data_train.shape[1]
Shape of data train tensor: (3733, 189)
data_test = pad_sequences(sequences_test, maxlen=T)
print('Shape of data test tensor:', data_test.shape)
Shape of data test tensor: (1839, 189)
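
A quick sketch (not from the notebook) of what pad_sequences is doing: by default it pads with zeros at the front and truncates from the front, which is also why index 0 must stay reserved for padding.

print(pad_sequences([[5, 7], [1, 2, 3, 4]], maxlen=3))
# [[0 5 7]
#  [2 3 4]]  <- padding='pre' and truncating='pre' are the defaults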
# Create the model

# We get to choose embedding dimensionality
D = 20

# Hidden state dimensionality
M = 15

# Note: we actually want the size of the embedding to be (V + 1) x D,
# because the first index starts from 1 and not 0.
# Thus, if the final index into the embedding matrix is V,
# then the matrix must have V + 1 rows.

i = Input(shape=(T,))
x = Embedding(V + 1, D)(i)
x = LSTM(M, return_sequences=True)(x)
x = GlobalMaxPooling1D()(x)
x = Dense(1, activation='sigmoid')(x)

model = Model(i, x)
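
A useful sanity check after building the model is model.summary(); with the values from this run (T = 189, D = 20, M = 15), the shapes should flow as sketched in the comments.

#   Input:                            (None, 189)       # None = batch dimension
#   Embedding(V + 1, 20):             (None, 189, 20)
#   LSTM(15, return_sequences=True):  (None, 189, 15)   # one hidden state per time step
#   GlobalMaxPooling1D():             (None, 15)        # max over the time axis
#   Dense(1, sigmoid):                (None, 1)
model.summary()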
# Compile and fit
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

print('Training model...')
r = model.fit(
    data_train,
    Ytrain,
    epochs=10,
    validation_data=(data_test, Ytest)
)
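
model.fit returns a History object; its history dict stores one value per epoch for each metric, and those are the series the two plotting cells below read.

print(r.history.keys())
# e.g. dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])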
# Plot loss per iteration
import matplotlib.pyplot as plt
plt.plot(r.history['loss'], label='loss')
plt.plot(r.history['val_loss'], label='val_loss')
plt.legend()
[figure: loss and val_loss per training epoch]
# Plot accuracy per iteration
plt.plot(r.history['accuracy'], label='acc')
plt.plot(r.history['val_accuracy'], label='val_acc')
plt.legend()